March 31, 2003
Back to userland, yay! Now time to further design sn_user.
I've got a file (/dev/snc0) that will read complete IP packets. It will
block or return -EAGAIN in non-blocking mode, so I can safely assume that if
I ever come off read() I'll have the entire IP packet as delivered to
ethernet. Cool. No wait, sn_socket_file will concatenate packets in its
readbuf, and then sn_protocol_direct will create sn_object_raw's with
concatenated data, and there are WAY too many malloc/memcpy's in that stream.
No, I think I'm going to have to pull the data out of the file directly with
a poll() in sn_user. (I don't mind having so many memcpy's in the TCP stream
since they go straight to the appropriate objects, but the sn_object_raw
pattern is too wasteful to use on the "ethernet" device.)
Integrating with non-ShiftyNet applications, then...
My original idea was to do the SVC connect during a "DNS lookup". When the
SVC is complete, the lookup will return an IP address in the same network as
the virtual device, and after that all packets will be directly tunneled. I
still like this idea. A more general approach:
let S be the IP address associated with the virtual device
any packet with dest=S is handled directly by sn_user:
  accept port 53/udp (DNS)
  drop all others
for any packet received by the device with dest!=S:
  if dest is known, pass the packet through the tunnel
  if dest is not known, drop the packet
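The rules above can be sketched as a single dispatch function. This is a minimal illustration, not the actual sn_user code; the names `verdict` and `dispatch` and the `std::set` of known destinations are my inventions:

```cpp
#include <cstdint>
#include <set>

enum class verdict { handle_dns, drop, tunnel };

// S is the address bound to the virtual device; "known" is the set of
// destination addresses the tunnel can currently reach.
verdict dispatch(uint32_t dest, uint16_t dest_port, bool is_udp,
                 uint32_t S, const std::set<uint32_t> &known)
{
    if (dest == S) {
        // only DNS is accepted on the device's own address
        if (is_udp && dest_port == 53)
            return verdict::handle_dns;
        return verdict::drop;       // all others dropped
    }
    // dest != S: tunnel if the destination is known, otherwise drop
    return known.count(dest) ? verdict::tunnel : verdict::drop;
}
```

The nice property is that everything a non-ShiftyNet application sends lands in exactly one of three buckets, so sn_user never has to understand arbitrary traffic.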
March 30, 2003
OK, I'm still getting my kernel oops, this time from the immediate task
queue. I got educated on the difference between "interrupt time" and "user
system time", which is why I kept fucking up the network side wake-up call.
I'm trying to put the netif_start_queue() call in a task queue and just got
a crash, so now I'm going to hunt for calls to netif_start_queue() in the
char drivers; surely SOMEONE out there is doing something similar?
Wait a sec...have I been stupid? Now I see netif_wake_queue(); what does
that do? Hmm, apparently it kills the machine with a kernel oops. I'm at
my wits' end, time to beg for help.
Got a response from Neil Horman pointing me to the
TUN/TAP project. I was on the right track, but got my solutions mixed up.
Turns out I CAN call netif_start_queue() IF BH is disabled. (This hint came
from TUN, as they don't do anything special really except use the trick from
Linux Device Drivers under "Going to Sleep Without Races".) I'm guessing
that netif_start_queue() immediately caused a context switch to the
interrupt side, not sure why we oopsed on it though, maybe a double lock?
Anyway, we seem to be working for now and I'm NOT gonna mess with it for at
least three more days. :-)
March 28, 2003
Having a kernel oops around the interruptible_sleep_on() call in
sn_ctl_read(), not sure why, but otherwise we're only one function away from
a working packet driver. By now it is so far off from ethertap that I'm not
worried at all about GPL-ness, yay.
Oops, wait_queue_head_t needs to be initialized before it's used, so the
blocking part works now in sn_ctl_read(), BUT I get a kernel oops when I try
to call netif_start_queue() in there. I suspect it's because I'm obviously
in the wrong process space to do it (the process calling sn_ctl_read() is
not the same process calling sn_eth_xmit()), but how do I get these
inter-process blocking mechanisms to work? OK, I have a theory: I can have
sn_eth_xmit() call netif_stop_queue() and THEN call
interruptible_sleep_on(), so that not only is IT blocked on the network send
but no one else can use the interface. Then sn_ctl_read() can call
wake_up_interruptible(), and sn_eth_xmit() can call netif_start_queue() on
its way out.
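The stop/sleep/wake handoff above, sketched as 2.4-era kernel code. The function bodies here and the names pending_skb and sn_ctl_wait are my guesses at the shape of the real driver, not its actual source:

```c
/* sn_ctl_wait must be initialized before first use, e.g. with
 * DECLARE_WAIT_QUEUE_HEAD(sn_ctl_wait); -- the oops of March 28. */

/* sn_eth_xmit(): runs in the sending process's context.  Stop the queue
 * so nobody else can use the interface, park the packet for the read
 * side, and sleep until it has been consumed. */
static int sn_eth_xmit(struct sk_buff *skb, struct net_device *dev)
{
        netif_stop_queue(dev);
        pending_skb = skb;
        interruptible_sleep_on(&sn_ctl_wait);  /* racy; see note below */
        netif_start_queue(dev);                /* reopen on the way out */
        return 0;
}

/* sn_ctl_read(): runs in the reading process's context.  After copying
 * pending_skb out to userland it releases the blocked transmitter:
 *
 *         wake_up_interruptible(&sn_ctl_wait);
 */
```

(interruptible_sleep_on() is exactly the racy primitive that the "Going to Sleep Without Races" section of Linux Device Drivers, mentioned in the March 30 entry, replaces with a manually-managed wait queue entry.)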
My weekend officially begins...NOW.
March 27, 2003
I'm working on the packet driver now. Ethertap has been illuminating,
though I'm trying to make sure I'm not "infecting" my Public Domain code
with GPL code. Not that I care for me, since PD can obviously go into GPL,
but if someone were to try and incorporate from me into a commercial project
they could get in trouble if I'm not careful.
Well, I've got the original kernel oops on the insmod fixed, moved the
buffers to sn_ctl_entry, put in the first memcpy's between the network
functions and the file functions, and got a new kernel oops somewhere
between sn_eth_xmit and sn_ctl_read. :)
I'm going to pause now and soak in the tub.
Got a packet from ping to be visible to test_ctl.c. Woo! Now to add
blocking read() and write() support and spinlocks around the network
functions and we'll be essentially done with the packet driver.
March 24, 2003
USER now recognizes an incoming SVC connection. Working on the right->left
ACK sequence. Cute, now I see why I had both ID and NONCE: ID has to be
stripped out on the reverse direction (which I'm not doing) or else the right-side
router will think the response is a new connection request. Ok, time to
sleep and pick this up tomorrow.
Moved the memcpy's in the applications to copy constructors for
sn_object_login and sn_object_connect, removed sn_object_login->
copy_header(), things are easier now.
Fixed up the ID passing, left side looks pretty good on the connect, right
side is close. To complete the protocol I'll need the nexus sidelinks, the
random choice code, and a couple more states for extending the chain size.
I'm going to switch tracks now and work on the packet driver. Once that's
in place I'm going to focus entirely on stability, getting the failure cases
to behave and building a better test harness.
March 20, 2003
Renamed the VPN classes/constants to SVC. Fleshing out the right-side
router portion of the connect sequence. It's coming back to me
now...every time I get back into this code I keep seeing how close I was the
last time. Sheesh. Now we've got the connect object containing the
original connection request so it can be passed to the right-side target.
I'll eventually encrypt the interior object so that intervening nexi can't
pair up the routers with the users.
Now I'm down to the accept portion on USER. Gonna grab a ciggie and hit the
mall.
March 19, 2003
I am adding support for the Boehm garbage collector
(http://www.hpl.hp.com/personal/Hans_Boehm/gc). It's VERY nice I'll say,
especially with the leak detection (which of course we've got lots of).
Having some trouble with the C++ parts though, but hopefully it'll get
ironed out shortly.
Got it working, I think. Needed to use Makefile.direct, NOT set
-DPARALLEL_MARK, and verify that it got through the C++ tests before using it.
Also forgot to include the GC in the other Makefiles, resulting in a "pure
virtual function call" abend. Oopsy.
Also, the Linux ethertap driver appears to be exactly what I was writing
last July. I'm going to examine the source to get my driver working
correctly, but I'll still keep my driver in for supporting pre-2.4 kernels.
My driver can also do better application-specific reporting in /proc.
So, current TODO:
  Complete the router->nexus->root->nexus2->router2 connection path
  Get sn_ctl working, look at the Ethertap driver
  Begin nexus tree
Long-range TODO:
  Switch to MySQL
  Mothball the Platypus code (that project is officially dead)
  Switch to OpenSSL
  Begin file transfer app
July 6, 2002
Well, a kernel function *can't* be called from userland without some serious
dicking with the source, System.map, etc., so I've decided instead to manage
it via a special file defaulting to /dev/sn_ctl with a dynamic major
number. It's a pretty simple API, and the nice thing is it'll be rather
OS-independent from the ShiftyNet binary's point of view. The module is
loaded by root of course, but it won't permit root to open it, instead it
has to be opened by a hardcoded UID. It puts an entry in /proc and prevents
more than one user from accessing it. Not bad so far; now I want to wrap the
control API, plug it into ShiftyNet, and then get the network driver in.
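A sketch of the open() gate described here, in 2.4-style kernel C. SN_ALLOWED_UID and sn_ctl_in_use are hypothetical names; the real module's checks may differ:

```c
static unsigned long sn_ctl_in_use;

static int sn_ctl_open(struct inode *inode, struct file *filp)
{
        /* refuse everyone, root included, except the one hardcoded UID */
        if (current->uid != SN_ALLOWED_UID)
                return -EPERM;
        /* permit only a single opener at a time */
        if (test_and_set_bit(0, &sn_ctl_in_use))
                return -EBUSY;
        MOD_INC_USE_COUNT;
        return 0;
}
```

The test_and_set_bit() makes the single-user rule atomic, so two simultaneous open() calls can't both slip through.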
Kernel programming is pretty cool. Very different from OOP. If you screw up
you've got to reboot to get to a completely known state, but then again
linux is still chugging along even if whole sections of /proc are causing
oops'es.
July 5, 2002
We've got 85% of the SVC connect done. Enough to begin work on the packet
driver. "Boo". "Ya".
Got a basic module to load and unload in linux. Defined a very simple API
to handle the packets. Now I need to figure out how to call a kernel
function from userland.
July 3, 2002
As per my usual, it's a holiday and I've nothing better to do than code with
whatever waking moments I've got left. Gonna try to fix up the connect FSM
this time. A fitting work indeed for Independence Day.
Got the infrastructure in sn_nexus in place. Kept seeing an assertion fire
in sn_queue_object from the router, turns out sn_socket_tcp wasn't calling
pipe->init(). Not sure why it waited for that particular codepath to show
up, maybe it needed more than two TCP connections open to appear. Fixed
though.
Line count. Library: 18723 ShiftyNet: 2832 Total: 21555. Not bad,
redesigns are trimming excess code.
Right side is about 40% there. Left side 0%. Still need the end-point
special cases. Right side and left side are 80% complete. I'm going to
wrap the connect_svc() call and then try to pass some data along it. Once
that's done I'll begin the linux packet driver.
June 10, 2002
Private life (church, work, impending move) consumed most of the last three
weeks. Now I'm back, just added the states to sn_vpn_info for sn_nexus
along with sn_nexus::process_vpn_info() to do that. They should actually
come together rather quickly, the difficult part (as always) will be
figuring out the various failure paths. GAA! The core has reappeared in
sn_root::run! Oops, I had given get_connect_string_from_sa the extra 5
bytes it needed for the unsigned int, but forgot the other 1 for the
terminating \0. Hope that fixes it...looks like it did. Good.
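The arithmetic that bites here is the usual one: a 16-bit port prints as up to 5 digits, plus one byte for the terminating \0. A small illustration of the safe way to do it (format_connect_string is a made-up helper, not the real get_connect_string_from_sa):

```cpp
#include <cstdio>
#include <cstring>

// A 16-bit port needs up to 5 digits PLUS the '\0'; snprintf never
// writes past len and always terminates, so an undersized buffer
// truncates instead of overrunning.
int format_connect_string(char *buf, std::size_t len,
                          const char *ip, unsigned port)
{
    return std::snprintf(buf, len, "%s:%u", ip, port);
}
```

Sizing the buffer from the worst case ("255.255.255.255:65535" plus the terminator) removes the whole class of off-by-one.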
May 28, 2002
Grr, found a buffer overrun in sn_network_manager::get_connect_string_from_sa.
Caused ROOT to core when it tried to do a new sn_object_connect(). Got the
sn_object_connect to be seen by NEXUS, now I have the trickier job of doing
the left-right split-connect stuff. That'll take a while I'm
sure...especially as work is kicking up a bit. And I'm reading a Bible
again. Sheesh.
May 27, 2002
Tiny bit of work tonight, we've got the connect_svc() path sending an
sn_object_connect from USER to ROOT. Some bugs to work out...ROUTER is
telling ROOT that it lives at 0.0.0.0:, clearly getting the wrong
struct sockaddr sent in its login sequence. Easy stuff, then on to sending
the initial connect command back down... Actually, it was a bug in
sn_network_manager::get_sa_from_connect_string(). Now it's ok.
May 24-25, 2002
Remembering where this project was... I've been doing a lot of personal
life things lately. Just woke up one day and couldn't get into ShiftyNet,
hopefully this next push will be rather productive. I believe I was working
on getting the VPN connect up, which means ROOT will need a lot more code.
So let me get some errands out of the way and I'll get into it.
I think I see...I had just barely gotten the chain up. Need to link
user_info and router_info so that VPN connect can know how to reach a
particular user name. Got that, pretty simple. Now the VPN connect
object...more complicated. Time to document the VPN connect sequence.
Added sn_vpn_info and sn_object_connect. Terminology conflict between VPN
and SVC in sn_callback/sn_vnos and everywhere else. For now I'll keep them
as is, but I'll rename the classes later, I think SVC is the better way to
go since it isn't technically a tunneled point-to-point link ala PPTP/IPSec,
and I don't want to bump into any "intellectual property issues" with the
commercial vendors. Shit for all I know the Onion article about Microsoft
patenting 1's and 0's might be right. :)
Gotta pause for a moment to do laundry and buy groceries...
April 7, 2002
Added a connect string to sn_object_login. Only USER and ROUTER will use it
for now.
April 6, 2002
Damn, I'm late with rent again. There goes $25. However, at least I'm
coding this weekend. :)
Made the nexus downlinks store a sn_link * rather than sn_async_pipe *, so
now the nexus knows which downlinks are routers and which are other nexi.
What next...?
Renamed sn_user to sn_user_info and added the sn_user application class.
Now we can build a chain of user-->router-->nexus-->root. Now I need to
modify root to keep track of active users and routers, and accept the VPN
connect request.
April 2, 2002
Woohoo! Thirty minutes of cut-and-paste and we've got the router role
running. I got a five-node network up and sure enough the nexi were
proxying the login requests back to root, which passed the authentication
successful message back.
April 1, 2002
Found it, turns out sn_node was matching src pipe id to dest pipe id in the
dispatch loop, which was broken because sn_cluster_manager already had those
mirrored correctly. Didn't see it for sn_cluster_manager's "evil twin"
connection since those are hardcoded to pipe ID (s,d) of (1,1). So now I'm
almost halfway through the login code...
Got the nexus login procedure dummied in. Now to hit the gym, then begin
working on the virtual circuit connect logic.
March 31, 2002
Happy Easter! Primitive login is coming along nicely. Nexus is almost up
to dispatching from uplink down also. I'm going to implement an ugly hack
for now: the login procedure will build a callback chain so that the root
can notify a nexus that authenticated successfully. Actually, this ugly
hack might be the nexus login procedure in the end, it depends on some
simulations whether I'll use the login-callback method for users also on
nexi.
Just tried a three-node network and sn_node didn't dispatch my object to the
right pipe. Looks like a cluster manager problem, could be a bitch to
debug, so I'm going to defer until the next CVS commit. Yeah, I see some
pipe ID problems, definitely. I suspect the cluster managers are
"accidentally" somehow uniquifying them across the cluster...or perhaps
across their slice of it.
March 25, 2002
Woah...talk about serious negative deja-vu. I'm writing objects to
encapsulate ye olde "shadow.h" -- at least I think that was the header file
from way back that defined all the "bands" and "messages". Shit it was so
long ago that night I sat down on the Pentium-60 (original 5V chip, with
FDIV bug) and pulled up Borland C++ 5.02 and began writing from an empty
screen. Those first hundred lines are coming back to haunt me, two years
after my degree and four years after their original inception. It's kind of
like...picking up your arithmetic book from sixth grade, after having
finished calculus, and shaking your head at how hard it all seemed then.
I'm creating the VPN command objects. These are all going to be subclassed
from sn_object_vpn rather than do the "roll-it-all-into-one-mega-object"
method ala sn_cluster_command.
March 20-23, 2002
Finally figured out how to link archives with circular references
(--start-group / --end-group options) and moved the ShiftyNet applications
into their own library. So Makefiles are MUCH simpler, yay. Went ahead and
committed the changes. Work is going to begin eating more time in the next
two weeks, so I'm hoping to push a bit in the next couple days. I'm heading
to Washington DC next weekend to see a friend and I'd like to have ROOT and
NEXUS roles working so I can show it off to her.
Writing the user class now. Went ahead and started writing a development
guide for the internal structure also...I hope I'll have at least one more
pro on it by this time next year. Had to add char * support to sn_hashtable
for the first time. Goodie. Now it's on to the NEXUS role...
Code check: ShadowNet 19090 Platypus 3408 Total 22498.
Slogging through the sn_network_manager->connect_uplink() call. I'm
deciding now whether the call should do the entire login process at the
logical VPN layer and everything, or if it should just come back when the
physical link is established. Probably gonna do the former, since it makes
life amazingly easy for the other applications.
I found a minor hole in my connection API: how will sn_network_manager know
which sn_cluster_node_info * got created when its call to connect_tcp()
succeeded? Damn, I'm going to have to bring it back in
sn_callback_connect_tcp, but that means changes to sn_socket_tcp,
sn_socket_manager, sn_node, etc., to percolate it up. OK I was wrong, I
just passed sn_node * to connect_tcp (cleaner anyway) and only had to modify
sn_socket_tcp::create_node() and sn_socket_manager to set it before
callback->invoke(). sn_network_manager already knows all about
sn_cluster_manager, so I just had it peek into a new hashtable to find the
sn_cluster_node_info *. So there, only about 10 lines change.
Cute. Another bug in sn_protocol_serialize such that the first object
wouldn't be de-serialized because readbuf_object_size was never initialized
to 0. Damn, I am SO not looking forward to running this on actual Unixes
that don't quietly calloc() in the place of malloc().
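That bug class in miniature: malloc() returns uninitialized memory, and code that accidentally depends on it being zeroed only works on allocators that happen to hand back fresh zero pages. The fix is to initialize explicitly (this is a generic illustration, not the sn_protocol_serialize code; readbuf_state is a made-up struct):

```cpp
#include <cstdlib>
#include <cstring>

struct readbuf_state {
    std::size_t readbuf_object_size;
    // ...other deserializer state...
};

readbuf_state *make_readbuf_state()
{
    readbuf_state *s =
        static_cast<readbuf_state *>(std::malloc(sizeof *s));
    if (!s)
        return nullptr;
    // never rely on the allocator zeroing anything; do it ourselves
    std::memset(s, 0, sizeof *s);
    return s;
}
```

(calloc() instead of malloc() gets the same guarantee in one call.)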
Wow. I keep surprising myself with all my tricks. sn_cluster_managers
always know how to talk to each other because they ONLY talk to each other
on different node pipes. Because they are guaranteed via codepath to be the
first pipe on the node, they can always just drop an object on that pipe and
it'll magically dispatch to their evil twin on the other side.
sn_network_manager, however, might not ever be instantiated if
__SHIFTYNET__ is never defined, so they have to use the same methods as
other apps. Now comes the question: should I provide a "find app ID"
function via cluster_manager, or should I just hardcode network_manager's
app ID to something special? Of course the latter is far easier...sure why
not. If we get like a hundred different projects out there I'll turn it
into a real API. Until then we all know C++.
Another "cute": I keep forgetting that in the run/callback loops I have to
maintain a state for "I already sent off that async call, DON'T DO IT AGAIN!"
Nasty race conditions/loops ensue otherwise. Silly me.
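That "don't do it again" state is just a pending flag consulted on every pass through the loop. A toy version (the names are mine, not the library's):

```cpp
// Toy sketch of the run/callback guard: run() may be entered many
// times before the async reply arrives, but must issue the call once.
struct async_guard {
    bool call_pending = false;
    int calls_issued = 0;

    void run() {
        if (call_pending)
            return;           // already in flight: DON'T DO IT AGAIN
        call_pending = true;
        ++calls_issued;       // stands in for the real async send
    }
    void on_callback() {
        call_pending = false; // reply arrived; a new call may go out
    }
};
```

Without the flag, every pass through run() re-issues the call and the callback storm feeds itself.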
There. Nexus can now get connected to the uplink nexus. I rule.
March 19, 2002
No, I've decided the final roles:
sn_network_manager:
  provide connect_uplink() capability
  initialize network-level (unauthenticated) crypto at the link level
sn_root:
  manage users, keys, global stats, etc.
sn_nexus:
  dispatch objects to and from uplink
  handle the VPN split-node logic
sn_router:
  manage the kernel packet driver
Hmm, I had hoped to ditch most of the clustering code, but alas ShiftyNet
needs to be able to pass sn_connect_app_remote calls, which have been moved
over to the cluster manager (better design anyway, can support multiple
pipes between the same endpoints). So clustering is baaack...and I'll need
to wrap all the exception cases in sn_cluster_manager before I can roll this
out. Ick. Hey wait a sec...was I smart enough to separate the role of
-DNO_CLUSTERING? Looks like I WAS! Wee!
This is beginning to look more straightforward than I was thinking. As in
once I get sn_root and sn_nexus stubbed in things will begin to move very
quickly indeed. There, now sn_nexus calls connect_uplink() to talk to
sn_root on local node. Need to add the sn_root init code (pick up a PGP
key, setup users, etc) and then I can stub in what I need to begin building
the network tree topology.
Damn. libdb2.so fucked up the __throw's with libstdc++. ShiftyNet can no
longer use DB2, which isn't a particularly bad thing but it'll be a headache
come Platypus time. Added user roles, which will undoubtedly change over
time (my terminology is pretty geeky at the moment).
March 18, 2002
Well, there went my "I promise I'll get some ShiftyNet work done this
month!" resolution. I've "upgraded" to RedHat 7.2, which means a better
compiler for exception-handling (yay) and more disk. Alas, DB2 7.2 doesn't
ship with the developer's stuff on Linux, so I had to copy some headers from
6.1 over. Brought the Platypus code up to the new specs, and the demo even
worked again. Wow. I've taken some of the heat off of this project because
of some personal life things coming up, but also because of the research
going on with FreeNet and Gnutella. I'd like to see some more results
before I try to claim the scene -- specifically I want to test my VPN
strategy using the same metrics developed to stress those other projects. A
bit longer before it's ready for consumers, but hey it'll be that much
better, right? Hehe. Ok anyway, let's see where I was at before...
Cool, my #ifdef's protected the Platypus port from picking up ANY ShiftyNet
VPN code. (I know because the ShiftyNet stuff is b0rken. :) )
Design decision time again. Given: object X is destined for ROOT:
  1) X --> sn_node
  2) sn_node --> sn_nexus (so X will have to be targeted to sn_nexus)
  3) a) sn_nexus --> sn_root
     OR
     b) sn_nexus --> sn_network_manager --> { sn_root | sn_nexus }
Well, one nice thing about sn_async_pipe is that it can be either a network or a
local connection, so sn_network_manager can present the same interface to
sn_nexus in both cases where I'M root and where SHE'S root. SO:
sn_network_manager will contain an uplink, and everyone heading there will
use the same pipe. This means alas that sn_network_manager will also have
to dispatch objects to the appropriate downlinks, but that shouldn't be too
bad, since each physical link connect will have to notify sn_network_manager
anyway ala sn_cluster_manager.
February 23, 2002
We added the second developer today! Scott's a long-time friend who's into
all sorts of OOP-isms, and honestly I don't know if he'll have time or
interest in the project, but I'm happy for now that someone else has CVS
access. I'm going to try to pressure him into writing the XML attack tree
editor :)
February 12, 2002
Working on the ShiftyNet application constructors and how they relate to
sn_network_manager. Trying to decide how they should interrelate within a
single node.
For example, in ROLE_ROOT I'll need both sn_root and sn_nexus out there, so
that sn_nexus can receive messages intended for sn_root. Now, I've already
got an API for app<-->app communication, and I could use some
connect_app_local() calls to set up sn_async_pipe's between them. OR given
that everybody can "know" each other directly via pointers, I could just
start off sn_nexus with its own sn_root * and use class methods to pass data
back and forth. The upside is that it's EASY to get them setup, the
downside is what may happen when I want apps to spin multiple threads --
sn_async_pipe makes life a lot nicer in multi-threaded environments. OTOH,
performance is not necessarily an issue; however, new APIs mean more work
in the end.
February 3, 2002
Finally put a tiny bit of effort into moving the "homepage" over to
SourceForge. Now we're back up to par with (and a bit beyond) the old ISP.
Code size: ShiftyNet 18239 Platypus 3412 Total 21651.
February 2, 2002
Hmm...CryptLib is complaining about the key load code. Digging into it now.
Geez, now I feel stupid. Re-compiled CryptLib with debugging and such so I
could follow a stack into their code, and then catch an assertion firing and
think it's a bug, but then look more closely and realize that cryptInit was
being called AFTER cryptKeysetOpen...stupid! OK, we're working now, so now
let's add some signature/verify code. :-)
Looks like the sig create/check is working...at least the API isn't
complaining. Now what? Gotta get my head back into this...I think I was
trying to get a simple root node up followed by a nexus node. I think the
next big milestone was a simple login process, with "uplinks" and such
automagically being set.
Perhaps I should spend some time figuring out how to handle
exceptions...most of them are immediately heading to terminate() due to
incompatible throw() clauses. Found it, after some searching on Usenet:
libdb2.so has __throw() compiled in, which is borken, adding -lstdc++ ahead
of -ldb2 solves the problem. Now exceptions are being thrown AND caught.
February 1, 2002
Figured out how to load PGP keys, so now I'm writing sn_crypto_keypair for
real. I'll leave the filename/keyid's as passable params via properties,
which makes it easy for users to make their own networks. However, the passphrase
for the secret key will have to be prompted for on startup, can't let the
kiddies hardcode that or their entire network comes down. This'll probably
take a couple days.
I figured out this year that I do really well when I've got about three or
four huge projects to work on. Each project moves along at a slow to
moderate pace, but when they all get done at about the same time I add it up
and it's a fucking mountain of code. This is my excuse for why ShiftyNet is
so slow to come up :-) . And every time I dig into this particular code it
tings my Big Life Priorities and I find myself thinking about grad school,
life, God, work, and whatever else comes up. Perhaps that's my mind shying
away from the scope of the entire project...I would never have expected that
20,000 lines would a) do as much as these do and b) be so goddamn hard to
get right. I mean, I'm nowhere near the multi-threaded tests, or the
multi-node tests, or the security analyses, or the P2P apps. It's a hell of
a lot, and I'm doing it mostly so that I can fix the problems that will
arise in my own subversion control network...the one I build after switching
over to the security side of things. Kinda sucks that I can't get paid for
that kind of work until I finish this.
Shit the weather here is NICE these last few days. Makes me glad I
moved...feels like college again. Finished up C.S. Lewis' _Mere
Christianity_, now there's a Brit who has the condescending tone down pat.
CVS update+commit with SourceForge, oops sn_queue_byte doesn't compile on
Slack 8 (gcc 2.95.3), ANSI forbids void * arithmetic. Fix it later...
January 31, 2002
Damn, it's been forever and a year. I forgot entirely where I was with this
project...been busy with a lot of Java and Perl the last few months. We had
a whole freaking lot of snow here a month ago, I was stuck at a friend's for
like six days. So now I'm trying to get my head back into it...I believe I
was about to block main() and figure out how to read PGP keys. That sounds
about right.
December 29, 2001
Fixed sn_queue_byte, it actually had several problems but I think it's
mostly OK now. More rigorous testing in multi-threaded mode etc should
flush out any remaining issues. Whatever, that's not what I'm thinking
about now. I'm deciding how to go about incorporating the ShiftyNet network
pieces into the library. My early policy decisions are now affecting
design.
First question: should a single process handle traffic for multiple
networks? Answer depends on the role. For nexi/servers, it doesn't buy
anything and confuses the issue as to which network's panic can shut the
process down. For users, it would be useful, perhaps required, in order to
splice multiple ShiftyNets transparently into the IP stack. However, even
the users have a workaround: put one DNS server in each network and each
network will have a chance to respond...decent caching code within the
ShiftyNet processes can make up for dumbass Mozilla's DNS-ing the same
fucking web site after every fucking click. (No, I'm not bitter, really.)
I originally wanted one process to do ANYTHING because of the Shockwave
Rider worm concept -- lots of surprises waiting in a small piece of code.
But I think now that the design becomes unnecessarily complicated, with too
many rules in the "what do I really do now?" department for the code. So,
we enforce the one network one process tie, meaning ShiftyNet application
code is globally unique within the process space.
Second question: where do I put the ShiftyNet application code? I've
already decided against multiple binary images for the different roles...I'm
not going to care that 60K is dead code at any given moment. Actually, it
would be much better to have a single binary, that way no one can
immediately determine the role being played on their box. (Damnit Kirsten,
I can't even type "box" without giggling, even when "box" is the technically
correct term.) I thought about the design while grabbing some tofu, and I
think this is what I like:
sn_network_manager will be implicitly created when __SHIFTYNET__ is defined.
This class will check the sn_properties in init() and decide if it has
enough data to startup, and it will maintain a global state for the network
(defcon, blacklisted nodes, etc.). It will create the various apps required
during its first run(). This is good because it isolates all the ShiftyNet VPN
functionality from the applications -- no special startup state machine with
a gazillion callbacks required. Applications need only poll
sn_vnos.is_box_ready() until it returns true and then start connecting over
VPN to their heart's content. sn_starter_router can be thrown away.
Should it also multiplex all ShiftyNet data into server/nexus or
router/loginhost etc., and handle the packet collector and DNS lookups?
Hmm...why multiplex here when sn_node does so already? [Later] No, it
shouldn't. I've already got support for everything at the sn_node layer.
So now I'm defining the various roles and applications:
root     Network master                        Secure
nexus    Internal tree                         Secure
login    External login                        Insecure
router   Bridge between user/host and nexus    Insecure
host     HTTP/file server                      Host
user     End-user system                       Hostile
Secure means no public access is permitted on these nodes.
Hostile means these nodes will be entirely out of my control, i.e.
end-users.
Insecure means these nodes will be seen by hostile/public nodes, so they'll
be immediately visible to packet sniffing.
Host is a special case, as they are "insecure" and have disk access.
The user will define the role, which will automagically land the node in a
network class. So a node that starts as "root" will be able to do nexus
also, likewise router and login. host and user will always be stuck in
their own class -- I don't want to risk a "user" storing host data.
December 19, 2001
Continuing the unit test code. There's a bug in sn_queue_byte...one of
those cases I've been waiting a year+ to test. Geez...I can't even remember
what I was thinking the first time I wrote it. I'm going to procrastinate
today and fix it this weekend.
December 16, 2001
Hmmph. tcsetattr() on a tty used to set the blocking/non-blocking quite
well, that is until I moved the tty to a select()-based manager, and then
ran into "issues" because apparently the O_NONBLOCK flag wasn't being
changed, so read() on STDIN was either returning 0 on no data (when it
should be setting EAGAIN), or it would actually block. So switch over to
the traditional fcntl(O_NONBLOCK) and ignore the tcsetattr and poof! we work
again. Yippie.
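The fix described above as a small helper; these are standard POSIX calls, nothing ShiftyNet-specific, and the function name is mine:

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <cerrno>

// Put an fd into non-blocking mode the traditional way: read() on an
// empty descriptor then fails with EAGAIN instead of blocking (or
// misleadingly returning 0).
int make_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}
```

Unlike tcsetattr(), which only changes terminal line discipline, fcntl() sets O_NONBLOCK on the open file description itself, which is what a select()-based manager actually needs.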
Got the TCP code alive again using the new classes. Now sweeping for NLS
messages and TODO's. Threw away another 500 lines with this change, and I'm
not quite so embarrassed by it anymore.
Hahahahaha! Mental note: do not send lines to the logger that will call
functions that call sn_log_entry(). I thought my codepath was screwed when
actually a set_log_level(DEBUG3) was killing my informative message.
Sigh...maybe I should LOCK the logger macros after all. This is still damn
funny though...
Ok, time for a CVS commit. Added two classes, threw away four (hehe, CVS
put them in the "Attic" :) ). Line check: ShiftyNet 17331 Platypus 3412.
Not a bad day overall. It reduces the cleanup...the most frustrating
feeling is that I-KNOW-I'll-have-to-sweep-this-again-for-error-handling. I
think that's the only reason I am so ruthless with smaller designs, less
reading in the future.
Looking at the date above, I see I'm going to slip my Driver 4 ship date by
several months again. Oooh goodie. Oh well, another year means JXTA will
get closer to completion before I unleash ShiftyNet on the masses, meaning
my performance will be even faster (relativistically speaking).
December 15, 2001
I've been on break from this project for a few weeks...now I'm hunkering
down for some real progress I hope. I'm just a few classes away from a
pre-test codesweep, but the daemon classes are bothering me now. Namely,
they are the oldest code left, with some oddball-isms and I don't like the
design...too much object hierarchy. They're maintaining statistics that I
don't officially need yet, and it may make more sense to ditch the frills
now and add them back later only if I need them.
So let's think about this. A "daemon" is a server process: it waits for
somebody else to initiate a connection and then does whatever. A "client"
is somebody else who initiates connections to a daemon. My daemons are not
quite standard to that definition. Namely, sn_daemon is a sn_runnable that
does all its work asynchronously. It's responsible for many disparate
things: 1) listener socket management (select()); 2) maintaining a list of
sn_socket_tcp's; 3) asynchronous connect requests; 4) statistics. (Yuck,
just saw get_localhost() for the first time in about two years...ooh.) One
of the reasons I hate touching this code is its age, the memory of getting
the damn thing to work the first time, and the uncleanliness of it. So a
redesign feels like more work than it really is. sn_daemon_console is
another weird case: it just loops on its socket list and calls
read/write_physical() which works because sn_socket_file polls itself. Now,
sn_socket has an fd, and except for the listener socket sn_daemon_tcp could
select() for the consoles and files also.
I could move the TCP listener socket support to a sn_socket subclass (not
even tcp), then I could move to a more general sn_daemon_socket that
wouldn't know anything about accept()/bind()/etc, so it could be used for
consoles too. Good. But what about clients? Separating the listener
socket from the daemon turns the daemon into a generalized network manager.
In fact, "daemon" could go away as a class name. I'd like a simple
object-based synchronous client eventually, one that could connect,
handshake, etc. without a bunch of asynchronous sn_vnos calls + callbacks.
That would make starter applications much easier, for example. But I think
separating the function would make that easier too -- I could write a thin
client network manager that is not a runnable so shiftynet_run() need never
be called.
Ok, I think I'm going to do this: sn_daemon_* will be removed,
sn_socket_manager, sn_socket_server_tcp will be created, and TCP support
only determines whether or not sn_vnos adds a new sn_socket_server_tcp to
sn_socket_manager. Nice, fewer classes, less code, but more extensible, and
I retain my "performance."
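One pass of the generalized select() loop a sn_socket_manager would run might look something like this; poll_readable() is an illustrative name, not real ShiftyNet API:

```cpp
// One non-blocking sweep over a set of fds: return the ones that are
// readable right now. A real manager would loop on this and dispatch to
// the owning sn_socket objects.
#include <cassert>
#include <sys/select.h>
#include <unistd.h>
#include <vector>

std::vector<int> poll_readable(const std::vector<int>& fds) {
    fd_set rset;
    FD_ZERO(&rset);
    int maxfd = -1;
    for (int fd : fds) {
        FD_SET(fd, &rset);
        if (fd > maxfd) maxfd = fd;
    }
    timeval tv = {0, 0};  // zero timeout keeps the sketch non-blocking
    std::vector<int> ready;
    if (select(maxfd + 1, &rset, nullptr, nullptr, &tv) > 0)
        for (int fd : fds)
            if (FD_ISSET(fd, &rset)) ready.push_back(fd);
    return ready;
}
```

Because the loop only cares about fds, a listener socket, a console, and a TCP socket can all sit in the same manager, which is the whole point of the split.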
Slogging through it now. Made socket_tcp a subclass of socket_file, and put
the non-broken read/write_physical's in socket_file, yay. I just thought of
a cool side-effect of the new classes: multiple listener ports (or none) on
one socket_manager. Ugh, if I could only stay alert this weekend and get
into the groove...this is taking way too long.
November 28, 2001
Well, I changed my mind and applied for a project at SourceForge.net today.
However, there are several other ShadowNet's out there, so after much
entertaining brainstorming with Kirsten I settled on a new name: ShiftyNet.
I get to keep the SN prefix on all the code yet alter the ... sociological
context. I've decided now: I'm pushing ShiftyNet to Driver 12 by 4Q 2002,
and I'll take any help to get there.
I've lost faith that the Gubmint gives a rat's ass about us. Corporations
are "cooperating" with Uncle Sam far too easily these days. They are
constructing a vast surveillance network of cameras, forms, and receipts to
track everything we do for both profit (which I don't mind) and law
enforcement (which I mind a lot). I can't stop them in the real world,
where it matters most, but I can contribute a blow for freedom in my own
domain.
The honest truth is that ShiftyNet is all about subversion, and I won't hide
anymore from external judgement. It will enable censor-free publishing,
untraceable Internet attacks, and massive digital piracy. Maybe only two
people will ever use it, but at least I'll be able to say I didn't just
stand by and watch us become walking wallets.
November 26, 2001
Adding C++ throw clauses to...hmm, 294 functions. This will take some time
to clean up. Also, looked at SourceForge and I do believe I'm going to move
the entire project over to it eventually, but not until I've got some
working clustering code...so probably March-ish. Damn, I like
CVS...update/commits across the entire project are much easier than the
source control system we use at work.
November 25, 2001
Grrr...I HATE code sweeps. I'm cleaning up the library bottom-up again,
changing assert(1 == 0)'s to throw exceptions, trying to doc all the
exceptions and putting some logging around them now. Also identifying the
classes that need test cases, and created a basic tester package. Alas,
many of the test cases will have to be externally scripted since they'll
involve multiple nodes...but we'll get there I promise.
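The assert-to-exception change, before and after; sn_exception and the queue functions are stand-in names for illustration:

```cpp
// Before/after sketch of the sweep: assert(1 == 0) aborts the whole
// process, while a thrown exception can be caught, logged, and documented.
#include <cassert>
#include <stdexcept>
#include <string>

struct sn_exception : std::runtime_error {
    explicit sn_exception(const std::string& what)
        : std::runtime_error(what) {}
};

// Old style: an impossible state kills the process outright.
int dequeue_old(int count) {
    assert(count > 0);  // the assert(1 == 0) pattern, in effect
    return count - 1;
}

// New style: the caller gets a recoverable, loggable error instead.
int dequeue_new(int count) {
    if (count <= 0)
        throw sn_exception("dequeue on empty queue");
    return count - 1;
}
```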
November 22, 2001
Thanksgiving Day. Last year at this time I first got the write app up and
running, with the first pass at node code running. This time around I'm
getting ready for the public CVS release. This weekend's goal is to create
the internal test infrastructure code, my development CVS release, and start
the formal bug-squashing/function completion work. Mainly just get me
geared up for the next year of work.
November 17, 2001
Whew...I've been pushing hard on the stuff I get paid for...now the team is
ten days ahead of schedule and I'm going to get some real time for ShadowNet
in the next several days. Ok, so I'm beginning to write the test suite,
which will make sure the library sub-components work correctly (lists,
hashtables, queues, etc.) and also push a few higher-level concepts for
performance tuning. I'm also going to be writing the development guide in
preparation for Driver 2. Either driver 2 or 3 I'll put the library on a
public CVS server and begin advertising for it.
My ISP account is still expired...I'll renew it in about a month -- I'm
going to send my $$$ to Bitch magazine first :-) .
November 12, 2001
Haven't done much with ShadowNet the last couple weeks. I visited my old
team in Dallas, and pushed the clustering code stubs for a demo there but
didn't actually do the demo. So...my priorities have changed. I want to
begin making the ShadowNet library itself robust...fix all the "TODO"s
listed in the source to get recoverability and performance back. Platypus
development will occur after ShadowNet now, not before. Also, for the next
couple weeks my account at the ISP is expired until I get a check out to
them, so the ShadowNet Project pages are non-updateable.
Compiles fine on RedHat 7.1. (Looks like I'm getting more portable with
age.) I'm going to begin coding the test suite first, then start the code
sweeps. Thinking about how I want to test the library. Should I embed a
bunch of test apps and use the .properties to activate them? Should I
search for particular output strings or return codes? How many layers deep
should the testing go? Etc...
October 23, 2001
More cleanup, moved the remaining static const's out of the headers and into
the modules. AIX is still core'ing with its illegal instruction during
crypto init...recompiled with NO_CRYPTO and now it's in sn_application's
constructor. Maybe it's a virtual function thing...this is weird. Let's
see about BSD...BSD port appears to be mostly clean (except crypto, of
course). UI came up, base cluster test OK.
Ok, I'm going to wrap now and call this Driver 1. Code size: 17710.
October 21, 2001
Cute. Size of binary image without crypto: 130K. With crypto: 789K. Well,
I slept on it and decided that INST_ANY and surrogation both go away. Only
the ShadowNet nexus app would've used them anyway. Hmmph. I ditched it and
am looking at the app<-->node problem again, and see that the issue is even
more complicated than I thought. Namely: should app A that already has an
async_pipe to app B be able to issue another connect_app() request to chat
to B? If yes, then both nodes will need to be aware of the triple
when they dispatch objects...meaning I'd have to put two more ID's at the
object serialization layer. This feels like overkill though.
Ok, let's rethink. 1) I NEED a way for CM's to chat directly through the
node immediately so they can support the LOGIN/CREATE_APP/etc calls. 2) I
NEED a way for the node to know which PIPE to dispatch its deserialized
object to. 3) I NEED the object type. Now, on sn_object I've got
<object type, app type, app instance>. I could switch app
type to src pipe ID, and app instance to dest pipe ID. This would satisfy
#2 and #3. I would need to hard-code the first pipe ID to be the CM's, and
that would satisfy #1. Some advantages to this approach are: 1) The node
doesn't need to know about app. 2) It removes some obvious cleartext from
the data stream (the NEXUS type/instance would've been in all the object
headers). Let's see, what's bad about it? Well, it forces the cluster
manager code to be permanent. It's also going to require lots of little
code changes. And it's a rather major spec change right before Driver
1...I'd rather I had seen this coming earlier. But...I think it's the way
to go. Let me think it over during laundry/ciggie now...ok, I'll do it.
Compiling now, back to the failed CREATE_APP call. CM's chat on a hardcoded
src_pipe_id/dest_pipe_id again. Just got pl_importer to connect to the
database...closer! Ugh, another state machine for pl_filereader...hmm...
I've already got a pipe from app A to app B, is there a shortcut to get C to
talk to B? Alas, not really. I need to add node_id to connect_app, but
that means I have to pass node_id back through pl_starter to the
filereaders. I gotta get some food...tofu sounds good right now.
Whew...re-did pl_filereader to use a state machine, added the
sn_cluster_node_info * to the callback_connect_app (kinda ugly exposing a
clustering object there), and debugging the CREATE_APP connect to an
existing app stuff. I must say that Platypus has been very good for
ShadowNet -- I would never have gotten this stuff in with just the VPN usage
scenarios. Wow, just saw another serializable object trip the sn_queue_byte
assertion. I think this is the _last_ thing to do before demo...
Platypus demo is "ready". Very rough around the edges, but it
sorta-clusters and does stuff now.
Code check: ShadowNet 17685 Platypus 3378 Total 21063. Neat, smaller than
last night yet more functional in the end.
So...I'm going to backport Platypus to AIX, BSD, and RedHat 7.1, then see if
I can get access to the HP-UX box at work. (Alas, I've no Solaris compilers
handy...Sun and IBM apparently believe a C++ compiler is an add-on rather
than a base OS requirement.) Once at least those two ports are in I'm going
to declare the Driver 1 cutoff and begin the long road of bugfixes to Driver
2.
Right now I think I'll shave my head and finish the rest of the tofu.
October 20, 2001
Cluster join is working, we get unique ID's now, BUT it's a star pattern
rather than a fully-interconnected network. But that should be enough for
the Platypus demo...so we move on. (Driver 2 will feature MANY bugfixes,
todo's, caught exceptions, etc.) Switched targets to Platypus, and it still
works somehow. I thought my API changes would've broken it, but they
didn't. So now I complete the login request, push the tokenizer onto a
remote node, and the demo will be basically done.
I took the bus in and out of work yesterday, and it worked rather well.
It's a half mile to the stop from my apartment, but after that it's
sit-on-my-ass, grab one transfer, sit again and dropoff by the cafeteria. I
feel like I'm in high school again, working on 8086 asm at home while I try
to get all my homework finished before 3pm so I can leave the backpack in
the locker. Heh. Also finished reading Kerouac's On The Road...would've
been easier if I was still 19 I think. I dig the Beats, but gotta wonder
why it's cool to let the Man fuck you over your whole life. (Sidenote: I
think the worklog is going to resemble the notebooks in Charles Sheffield's
"A Braver Thing".) I'm wearing the same outfit as the day I walked 19.8
miles in 13 hours, through rain and lightning. It's much looser
today...feels nice finally being on the smaller side of huge.
It's later...this is pretty cool. I re-did the Platypus launch sequence,
now I'm adding RT_LOGS and RT_STANDBY. It's a weird development cycle
tonight...add the code for process_request in cluster_manager, test, then
add the process_response code, then on to the next cluster_command. JOIN
and LOGIN are stubbed in, now doing the CREATE_APP (pretty simple). Oops,
not so simple after all. There's a big hole now: if an app gets more than one
async_pipe pointing to the same node, the node will get confused and
dispatch deserialized objects to the wrong pipe, thoroughly fucking up the
app. I'm going to have to maintain an app<-->node list to ensure they stay
1-1 to avoid this. I'm considering dropping the INST_ANY option too -- I
don't see it being all that useful with the new cluster manager.
Hmm...line check. Platypus: 3185 ShadowNet: 17951 Total: 21136
Lord only knows what Driver 2 will look like...bugfix code tends to swell a
project real fast. My contacts are dry and painful now, so I'm going to
sleep now. (Whoa, cluster_manager is now the largest class (831 lines) and
is only about 30% done.)
October 18, 2001
Wrangling through the clustering code now. Fixed an out-of-syncness between
clustering and sockets (namely, the daemon finishes the callback->invoke()
with SUCCESS before the cluster manager was aware of the node, so clients
would call join_cluster() prematurely). Now I'm figuring out the cluster
communications -- not entirely trivial since one node has to be the master
to assign unique ID's, so other CMs will have to proxy some requests to it.
How will a CM know when the request was proxied vs. sent directly? Etc.
On a personal note, I suddenly acquired more car hassles last night...ran
over a retread on the highway. I hate cars. I've decided to make use of
the local busing system(s) for work and just use my two feet for the winter.
I sure wish I lived in a more "enlightened" city where subways can actually
get you somewhere. The perfect time to finish reading On The Road, I think.
October 17, 2001
Been busy with personal stuff and work lately. However, I'm now working on
the cluster manager to handle multiple nodes/runs/etc. Trying to get the
Platypus demo up. Also made a first pass at porting the pre-D1 download to
BSD. CryptLib didn't make cleanly on FreeBSD, so I put better protection
around the sn_crypto classes with the NO_CRYPTO define. Copied the codebase
to hate (Pit Labs). I'm going to keep it synced with the latest code on a
fairly regular basis (minus platypus, of course). RedHat 7.1...another
compiler, more problems.
October 7, 2001
Backported the new stuff to AIX. Now I need to figure out how to do the
free/swapped memory check (something to do with knlist()), also the console
doesn't unblock correctly. Amazingly enough it actually WORKS! Whatever
was core'ing it before is gone (I suspect it had to do with my class-level
static constants). I even got it to talk to a Linux node. Woohoo! I'm
going to put a pre-D1 tarball on the webpage for anyone out there.
October 5, 2001
Got the first pass at cluster manager in. Now when two nodes connect over
TCP, their cluster managers immediately exchange basic system metrics
(sn_cluster_node_info). Need to add support for sn_cluster_run_info, still
unsure about all the specifics on it. I'm going to run into some ID issues
across the cluster (app instance in particular). Split connect_app into
connect_app_local and connect_app_remote -- important because it's a simple
way to use cluster resources without actually joining the cluster. Getting
there slowly....
Once I get this next phase in, I'm going to finish porting the existing
codebase to AIX (at least get it to compile), then I'll need to switch
tracks and work on the Platypus prototype. I could be showing it to some
people within a month. ShadowNet: 16,993 Platypus: 2,824 Total: 19817.
My ethical dilemmas are still out there. First, I've done a lot of reading
about (real) US history and (real) current issues. Second, I've looked at
some of the items in the PATRIOT Act. Third, I've been listening to the
people around me. Fourth, I've switched sides on the kill-em-and-eat-em
issue -- two days so far eating only plants.
Let's talk about US history. Christopher Columbus came here specifically to
get wealthy, maiming and killing thousands directly and millions through
disease. The British colonists would attack natives in a way we'd call
terrorist today. John Brown was not insane, though our textbooks were
re-written to say so. The US tried to illegally assassinate Castro many
times -- at least eight confirmed via testimony. JFK almost authorized a
first strike against the USSR with nuclear weapons. I could continue, but
the point is that for those people born outside our borders, Old Glory
represents a very different thing than it does to "us." Osama bin Laden,
like John Brown, has "expanded" the boundary of global thought -- it is
indeed possible to attack the US government by killing its citizens.
For the "War on Terrorism" to be legitimate to future generations, it must
a) be legal on a global scope, b) be morally consistent, and c) address the
causes of the global terrorism movement. All of these can be achieved if
the US operates through the UN (tries criminals in the ICJ), cleans up its
foreign policy (eliminate covert ops, honor international treaties, etc.),
and goes after the Western terrorists (Operation Rescue, IRA, etc.) in
addition to the Middle Eastern terrorists. I see little evidence that the
current government would have as much foresight as, say, the government of 1945
that enacted the Marshall Plan. I expect that it will create many more
enemies for its citizens than it will eliminate -- such obvious incompetence
may be a sign that our experiment in governance should be concluded.
I am far from final conclusions, but my gut is telling me that ShadowNet
proper will have many uses for me besides saying "fuck off" to RIAA/MPAA --
uses that are today still legal but might not be for much longer.
Dissidents in China trying to get the word out, certain citizens in the US
trying to communicate to their relatives in other countries, etc. These
issues require deeper consideration before anything past Driver 4 gets
"unfrozen," but for those interested let's just say nothing is final.
October 2, 2001
Thanks to the pthreads FAQ for showing me how to determine the number of
available processors. Added code from BogoMips to get some idea of
processor speed -- but it's not reporting quite the same numbers as the
standalone program, probably due to compiler optimizations and such.
(Yup...non-debug compile produces numbers very similar to the real BogoMips.
Need to remember to test clustering using similar-compiled targets.) Now I'm
moving some code from sn_vnos_api to sn_cluster_manager...I want to get the
basic application clustering working first, then the test suite, then some
more UI.
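The processor count the pthreads FAQ trick yields likely reduces to a sysconf() call on Linux and AIX; a sketch, with online_cpus() being my own name:

```cpp
// Query the number of online processors via sysconf(). This is the usual
// portable-ish route on Linux/AIX; the fallback is an assumption for
// platforms that don't define the query.
#include <cassert>
#include <unistd.h>

long online_cpus() {
#ifdef _SC_NPROCESSORS_ONLN
    return sysconf(_SC_NPROCESSORS_ONLN);
#else
    return 1;  // no query available: assume a single CPU
#endif
}
```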
It's beginning to feel like a real framework now. I won't claim stability
is any better than pre-sweep since the multi-threading code is largely
untested, but I'm beginning to feel more confident in the overall shape of
the library now. Actual Driver 1 is going to ship in probably another
three-four weeks, BUT it will ship with almost all the functionality
targeted for "Planned Driver 3." Actual Driver 2 should be out before
Thanksgiving and will have all the function targeted for "Planned Driver 4."
Actual Driver 4 will have a reasonable set of test code -- I still hope
it'll be my New Year's present to everyone.
September 28, 2001
Four...classes...away...from...pre-sweep...functionality...must...sleep...
now. 16,054 lines of code (non-Platypus).
Ok, I've slept, now I'm coding again. I fixed the sn_window bugs, and also
found out how to get nice box-drawing characters on screen ala BitchX. (man
console_codes, send "\033(U" to set, "\033(B" to restore.) So, the display
comes up if desktop=enabled in the properties. Also I broke the network
test case by finishing the find_app_for_node() call...oops.
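The console_codes trick above, wrapped in a tiny helper; the helper names are mine, and the 0xC4 horizontal-line glyph assumes the Linux console's PC character set:

```cpp
// "\033(U" selects the straight-through (PC/CP437) character set on the
// Linux console, exposing the box-drawing glyphs; "\033(B" restores ASCII.
#include <cassert>
#include <cstdio>
#include <string>

const std::string BOX_ON  = "\033(U";  // switch G0 to the PC character set
const std::string BOX_OFF = "\033(B";  // switch G0 back to US ASCII

// Draw a horizontal line of box-drawing characters, then restore the
// character set so later output stays readable.
void draw_hline(int width) {
    std::printf("%s", BOX_ON.c_str());
    for (int i = 0; i < width; ++i)
        std::putchar('\xC4');  // horizontal line glyph in the PC set
    std::printf("%s", BOX_OFF.c_str());
}
```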
So now I get to take a nap and figure out what has higher priority this
weekend: starting a test suite, building more in the UI, or starting the
cluster manager. Actually, I think the cluster manager needs to be stubbed
in Real Soon and the test suite is critical. This thing is begging for
gprof, and I've been itching to try out the multi-threaded modes.
Naptime...zzzz....
September 12, 2001
This is a strange day. I hate it when people find ways to relate everything
in the universe to themselves, e.g. when a plane goes down in Scotland, one
of my friends will be like "Yeah it sucks, I've got a friend with a
Scottish girlfriend, and I need to ask him if she's alright...".
hate the armchair quarterbacks who make predictions on world events that
they have absolutely no involvement with. So with that said, I've got no
disclaimer available because I'm about to do both. I suck.
The World Trade Center was destroyed yesterday, and the Pentagon was
damaged. Four domestic flights were hijacked in the US and three of them
made it to their targets. Thousands are dead, and for various reasons I
can't do anything useful to help the rest. How this relates to me is: I
need to re-evaluate the ShadowNet release plans for several reasons. Let me
say first that the ShadowNet library is very cool and general-purpose enough
that I see no need to alter the releases up to Driver 4. However, the
ShadowNet applications combined with the library are revolutionary and
represent an almost-perfect communications solution for the goddamn
terrorists. It's nice to say in principle that if X is outlawed only
outlaws will use X, but it's a bit different to see X used successfully in
the systematic killing of thousands.
(For the record, I am a bloodthirsty Texan like Dubya, and as the world is
well aware we'll kill just about anything with or without an excuse -- it
took some effort to stop myself from attacking my Indian neighbors this
morning because they physically resemble Palestinians from TV. Ok, maybe
I'm exaggerating. But not by much. Also for the record, there were two
messages from Lou (the CEO), and let me say that I wish that man was running
the country right now...I have never read such simple yet dignified
responses to such a situation. Lou's got half a million people ready to
jump in and restore critical infrastructure...I love working for this
company.)
Anyway, back to ShadowNet. Until I resolve my own ethical questions,
ShadowNet will not develop beyond Driver 4. ShadowNet Information Warfare
Edition is also on permanent suspension until the legal debates have been
settled through legislation and/or certain court decisions.
September 10, 2001
I am apparently a glutton for this shit. The UI base classes
(desktop/windows) have been updated to compile. I'm only three classes away
from full pre-sweep functionality (meaning silly ANSI UI).
I decided the final (yeah right) node->app policy: nodes will ask sn_vnos
for an app by instance only, NOT TYPE. The original purpose of the app_type
field in sn_object was to handle the broadcast situation between nodes, but
since I've already got sn_vnos::connect_app() and INST_ANY, I can give the
actual application instance determination decision to the cluster manager,
hence nodes don't need to know anything special. So...we've got the final
gluey API's in place. All I need to do now is write the cluster manager,
its protocol object, and do some unit testing... a few more weeks.
Looks like Driver 1 might not ship; by the time it's ready I'll be halfway
through the Driver 2/3 functions.
Yeah, so I'm going to grab the last beer out of the fridge and get some
sleep.
September 9, 2001
Got the crypto classes stubbed in. Still haven't figured out how to do
Diffie-Hellman with CryptLib 3.0, but I do have the basic contexts working
-- and wrapped in my own sn_ classes just in case I run into license issues
later on. Debugging some misplaced sn_thread pointers now...gee this is
tiring. Anyway, without DH I've hardcoded the derived keys (free brownies
for those who find the passphrase :-) ). Still haven't written the
encrypt/decrypt routines but they're only like 10 lines each (thanks to
CryptLib). Almost...almost...where I was in December 1998 with ye olde
main.c/server.c/crypto.c code. Only I've got 10X the codebase now...go
figure. Current stats:
ShadowNet: 15,963 lines
Platypus: 2,824 lines
[Later] Ok, time for coffee and some ciggies. We have encrypted sockets
now. Repeat: we have encrypted sockets now. (Still need to figure out DH,
but that can wait.) Only one tiny tweak left in the network layer
(node->app communication) and then I begin the test suite for Driver 1.
September 8, 2001
Whew...sn_protocol_serialize is about done minus flush(). sn_node_direct
and sn_node_multiplex have been removed, even though sn_node isn't quite
complete. I just couldn't stand seeing them anymore...ugh. I also added a
"test_node=" key to the .properties to get two nodes to automagically come
up and talk to each other. The session-key state machine is ready in
sn_node, I just need to finish the sn_crypto API and patch in the CryptLib 3
functions. Damn, this project has turned around since the code sweep. The
bugs are much easier to crush now too...I love me my gdb.
September 7, 2001
Let's talk about how much of a loser I am. Yesterday I slept for almost 16
hours of the day. Today I pigged out on doughnuts and Chik-Fil-A. And
right now I'm sitting in a freaking bookstore on a Friday night. "What-evuh
chigga-head."
Ok, more seriously... Cleaned up the nodes considerably, moved the object
serialization code to a new class (sn_protocol_serialize) and I'll soon have
sn_node_direct and sn_node_multiplex thrown away. So the socket now creates
a sn_async_pipe and sn_protocol_serialize upon accept() or successful
connect(), and then a node and spins the node up on a new thread. Much
simpler in code and makes more logical sense. So now the remaining object
dispatching goes into sn_node itself, then I implement a new flush() on
sn_protocol (to replace the dry_runs stuff in sn_node_multiplex) and I'll
have object serialization wrapped up. That'll be about 1/3 of the work
before Driver 1...the automated test suite being the remainder. I'll
probably pull the CryptLib 3 support into D1 and shove the ANSI UI into D2,
since UI is less important to me. And further on the horizon
is...clustering. I sure do like this "getting paid to write code I like"
thing, I'll have to do more of it in the future. :-)
September 5, 2001
Still porting to AIX. Now it compiles AND links. I had to comment out the
references to sn_log_entry() in the global object constructors to avoid the
"class initialization order" problem...that was causing LOCK to throw an
exception since sn_logger_multiple hadn't been constructed to call
MUTEX_INIT. Now the problem is an illegal instruction in new pl_starter().
Well, for now I'm going to switch modes and continue the Driver 1 sweep...I
need to get the network daemon living again soon.
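The standard escape hatch for that initialization-order problem is construct-on-first-use; a sketch using the sn_logger_multiple name from above, with the rest illustrative:

```cpp
// Construct-on-first-use: the logger is reached through an accessor
// instead of a global object, so it is built the first time it's needed,
// even when the caller is another global's constructor. This sidesteps
// the cross-translation-unit initialization order entirely.
#include <cassert>
#include <string>
#include <vector>

class sn_logger_multiple {
public:
    void log(const std::string& msg) { lines_.push_back(msg); }
    size_t count() const { return lines_.size(); }
private:
    std::vector<std::string> lines_;
};

// Function-local static: constructed on the first call, never before.
sn_logger_multiple& the_logger() {
    static sn_logger_multiple instance;
    return instance;
}
```

With this shape the LOCK/MUTEX_INIT problem goes away too, since the mutex can live inside the instance and is initialized along with it.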
The TCP daemon lives again! Found a silly bug in sn_socket_tcp that would
explain why I wasn't catching disconnects before. I also figured out what
was bothering me about the nodes. I was trying to do too much with them.
node_multiplex has got to GO. I wanted to enforce a layer between
applications and raw I/O, which is good, but I've now got that via
async_pipe, so nodes are now representative only of other physical ShadowNet
nodes and they return to their original purpose of a) handling link-layer
crypto key negotiation and b) 1-N cardinality between sockets and ShadowNet
nodes (since one UDP socket can talk to many nodes). I'm going to add
chaining support for protocols and async_pipes, and then the socket-->app
path will be very simple to implement in code: socket->protocol_crypto->
protocol_direct->async_pipe->node->->app. So apps still
see everything as just async_pipe's, no difference between a file-based
object deserializer (hehe, handy for SDFS) and a different node. ("This
abstraction was brought to you by...the new and improved Brillo Cluster
Manager!") Kinda schweet, the VNOS and daemon both look at the .properties
to figure out the ports/etc, so I can dump a bunch of getters/setters
between the layers.
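The chaining idea can be sketched as links that each transform the bytes and hand them to the next, so the layers compose without knowing about each other; everything here, including the toy XOR "crypto", is illustrative only:

```cpp
// Minimal protocol chain: each link processes data and forwards it, so a
// socket->protocol_crypto->async_pipe path is just composition.
#include <cassert>
#include <string>

struct sn_link {
    virtual ~sn_link() {}
    virtual std::string process(const std::string& data) = 0;
};

// Transforming link: stands in for the real CryptLib-backed protocol.
struct protocol_crypto : sn_link {
    sn_link* next;
    explicit protocol_crypto(sn_link* n) : next(n) {}
    std::string process(const std::string& data) override {
        std::string out = data;
        for (char& c : out) c ^= 0x2A;  // toy cipher, NOT real crypto
        return next->process(out);
    }
};

// Terminal link: where the async_pipe would hand objects to the app.
struct async_pipe_end : sn_link {
    std::string delivered;
    std::string process(const std::string& data) override {
        delivered = data;
        return data;
    }
};
```

The payoff is exactly the one described above: apps see only the pipe end, and a crypto link can be spliced in (or left out) without touching either side.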
Ok, I'm hungry and tired now. It'll be a few days to wrap the AIX port and
re-build the object serialization layer, then I'll do a code drop for Driver
1. (I'm still a bit embarrassed that Driver 0 was only 12,000 lines, and
would only compile on (some) linux's, and had no network daemon, and ... you
get the idea. But I had to finally get my ass in gear somehow, and
humiliation is a good motivator sometimes...)
September 4, 2001
Began the AIX port today. Blech...the linker had lots of errors with my
static class variables, so I did as the FAQ suggested and split the
initializers from the declarations. Not sure if I'll get the "class
initializer order" problem or not. We compile now. Had to create new
targets for libplatypus/libsn since AIX ld has to be wrapped by
makeC++SharedLib (grrr, Makefile is getting messy again).
AIX ODM/cfg behaves differently for getattr, sn_socket_console will need
adjusting. Mutex locks just failed with an EINVAL. Looks like
sn_logger_multiple was never constructed. I'm beginning to see why most
projects out there are in C rather than C++ -- this is stupid.
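The declaration/definition split the FAQ suggested, compressed into one file for illustration (the names are made up):

```cpp
// The AIX linker fix: declare the static member inside the class, define
// and initialize it exactly once in a module, so the linker has a single
// definition to resolve.
#include <cassert>

// header (e.g. sn_config.h)
class sn_config {
public:
    static const int max_nodes;  // declaration only -- no initializer here
};

// module (e.g. sn_config.cpp)
const int sn_config::max_nodes = 64;  // the one definition
```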
September 3, 2001
Work took most of this month from ShadowNet. We're about to release there,
and happily I can now say that ShadowNet Driver 0 is out! I did some
function cuts for this, and didn't get around to any of the non-linux ports,
but I also made a roadmap of Drivers 1-16 (13 is official beta, 16 is 1.0
release) so I'll be able to concentrate on actually building this thing.
Driver 0 only features 12,000 lines of code, I didn't even bother putting
the nodes or UI classes out (they've got the wrong license header anyway).
Driver 1 will complete the original "pre-release" functions I wanted to ship
by now, but will be much stabler and I promise I'll have the AIX, Solaris,
and BSD ports ready by then. I'll do an actual CVS tree with announce on
Driver 4 -- that'll wrap the cluster manager and we'll be on the ShadowNet
applications proper.
Point is...to any developers out there: some of the functions in Driver 0
are cool, but don't bother toying with code until Driver 2 is finished.
Then you'll have a reasonable multi-threaded "kernel" with network support
and link-level crypto. By Driver 4 you'll essentially have the entire
ShadowNet application framework. And I'm hoping that will be ready for
everyone by New Year's. Then this pitifully small 12,000 lines will be
closer to 25,000.
Ok, now for an actual code update. I'm not happy with the nodes, hence I'm
delaying the object serialization support to Driver 1. Everytime I open the
files I get that "I want to get out of here ASAP" feeling which tells me I
don't have the right design. I'm going to add easy chaining to protocols,
which will immensely simplify sn_node_multiplex, and I want more practice
with sn_async_pipe before I settle on a permanent API. Let's
see...sn_socket_file is broken (I'm using one buffer for both read's and
write's), sn_socket_tcp is being split into sn_socket_tcp and
sn_socket_tcp_server (which will have the accept() call). The split is to
make a synchronous client easier to write later. I added a copyright
statement to the Makefile. Ripped the Platypus launch sequence for the
router. Hmm...a few other things I can't remember right now. So now I'll
finish updating the web page and begin the AIX port...hehehe.
August 12, 2001
Code sweep at 60%. Going through the sockets layer now. Replaced the
map<>'s. Sidenote: I saw a lot of new C++ books at the store this weekend
aimed at "professional" coders, with heavy emphasis on templates, generic
classes, etc. I've decided that templates are cool for small
things, but for real systems where performance is an issue I just don't see
much value in "reusable code". I think Java gave the best answer there is
to the idea of code reuse: make a standard library and force people to
adhere to it, and that library is the only code that gets reused. Seriously, this
holy grail of reusable code just doesn't seem to be buying us much as an
industry. Anyway, back on topic... I'm going to begin porting the VNOS
core (the new name for sn_application_manager) to BSD-ish systems this week.
This will constitute a "mini-pre-release" of sorts, in that it'll be the first
time anyone else has actually seen the code. I'll try to get the sockets
wrapped ASAP for that effort, but I'm thinking of just leaving the UI alone
for now...it'll be neat in the end but for now I'd rather focus on the
network and crypto state machines.
August 8, 2001
Code sweep at 50%. We compile again! Still have a few list<>'s to rip out,
and the map<>'s go too, but things are looking good. Amazingly few
unexpected behaviors so far...keeping my fingers crossed. Here's what I
find kick-ass, though: prior to this sweep, the compiled debug version with
STL list<>'s and such was 4 megs stripped to 1 meg. Now it's 1.2 megs
stripped to 120K. Yes, I've got a multi-threaded environment that
has a binary footprint of 120 kilobytes. Add the network, UI, and crypto
back in and it shouldn't be more than 300K. The load/init/see-nothing-to-do/exit
cycle took under 2 seconds before the sweep -- I'd imagine it's close to 0.5
seconds now.
August 6, 2001
Code sweep is roughly 35% complete. I've got the base classes cleaned up
considerably, a consistent CVS-ready header on most things, and the library
compilable again. I starred the items below. Still need to update Platypus
to use the new sn_vnos (formerly appmanager) API's, then I'll sweep the
network layer (ugh) and the ANSI UI.
So the plan is:
1) Finish the sweep (1-4 weeks)
2) Pre-release onto a public CVS server, initially change-locked
3) A lengthy debug cycle, testing especially multiple threads/low memory/high
load/etc. (1-4 weeks)
4) Link-level crypto (update to CryptLib 3.x) (2 weeks)
5) Announce the pre-release on Usenet
6) Add cluster management
7) Open the CVS tree to community development
8) Platypus (no public release) and ShadowNet (6-12 months)
July 8, 2001
Hmm, five months. A lot and a little has happened in five months. First,
ShadowNet proper has been split into three projects. The "ShadowNet
Library" (libsn.a) provides the application framework (threads, database
support, clustering, etc). "ShadowNet" is the set of applications that will
likely send me to jail. "Platypus" is an internal project for the company
only whose purpose is to push the ShadowNet Library to the limit of
enterprise computing. So given this split, here is my status...
ShadowNet Library has about 65% required functionality now. The major holes
are clustering and crypto, both areas that require lots of state machines at
node and app layers to run. The API's are there, they just cheat. This
piece is about 14,000 lines now and needs a complete code sweep badly. The
current todo items on that sweep are:
Code sweep:
Organize sn library
* - base (internal only, logger/memory/exception/etc.)
* - lib (base + global pointers to appmanager, signal handlers, thread, application, etc)
* - application (lib + sql)
Fix sn_application subclasses to pass correct apptype in constructor,
no need for callers to know about the sn_application_list constants.
* Unify copyright logos
* module function/author/revision #/etc
Fixup conventions
Replace any long's with time_t (if there are any)
* parameter names
local variable names
* return types (switch int's to bool's in daemons)
* headers
* indentation/log parm spacing/etc
* Add log_entry to all functions
* Multiple log debug levels (DEBUG, DEBUG2, DEBUG3, ENTRY)
Fix sn_propertylist to handle multiple identical keys (e.g. many items specified)
get_property_int should return int, not int *
is_valid() checks in objects
Fixup exception handling
* strdup --> sn_memory
log_err levels for common conditions:
coding errors -> EMERGENCY (where assert cannot be used)
* Wrap potential error libC calls
* Replace STL list<>
* Replace STL map<>
Test suite
Multi-threaded mode
Object serialize/deserialize
Synchronous client-only TCP support (not in the daemon class tree)
Move most of daemon_tcp to it? Daemon_tcp just needs to hang onto
one more socket (listener) and handle accept(); otherwise
the rest of the code is the same.
This code sweep is a lot of work -- 128 source files. It's very dense code
too: these 14,000 lines would normally be 40,000 in commercial software
(speaking from experience BTW). So if I had 40 hrs/week on it I'd size it
at 4 weeks, but I've got more like 10 hrs/week. So just for those people
who might be aware of the project and just want to see some code, I can
finally see the light at the end and will pre-release the library into the
Public Domain once this sweep is done. However, the code will still be
buggy as hell. I've got a HUGE debugging effort ahead that will probably
last several months, so I'll be updating lots of files after the
pre-release.
The ShadowNet applications have not been touched at all this year. They
depend on the library, and the library won't really be finished until
Platypus is in commercial use, so they'll be waiting for quite a while.
When they are finally done it's likely the competing P2P projects will be in
full force in the market, so I'll have a lot of ground to catch up on.
Finally, the Platypus project has really started. It launches in a
synchronous fashion, and has access to all the library components it needs,
and gets some real work done. This is good news for ShadowNet, because
Platypus has very serious performance requirements (I'm targeting at least
6X faster than its best competitor).
So that's the status of the three big projects. On a slightly less
technical note now, I'd like to outline some of the implications of these
projects.
The library started as an asynchronous network daemon, but has grown into
essentially a C++ version of Jxta and EJB. In other words the library will
compete against J2EE and .Net. It will also be MUCH faster than either,
since I'm building ground-up only what I need. Any coder out there will be
able to trim and customize it into their own app that will a) execute
faster, b) on cheaper hardware, c) in C++ (Java sucks), d) for $0. So I
don't foresee much industry support.
Some people have suggested that I seek funding and sell this. Stepping on
my soapbox for a moment, let me explain why I won't. This industry employs
hordes of unskilled programmers who crank out crap and sell it for
exorbitant sums. It starts from the top: academics who don't know their
way around a process list push OOP (via Java), CASE tools, UML, XML, and
other worthless gizmos into the industry; executives hear the acronyms and
force their project leaders to adopt them; kids out of school are hired only
if they know the acronyms; the next generation of very pretty yet damn
unstable crap hits the market and is priced at what the market will bear;
the financial sector buys it because they don't know shit about computers.
And the cycle continues, more acronyms are created and more "breadth of
knowledge" is required, yet the final product quality continues to fall. I
refuse to let this project participate in that cycle. I'm cutting my losses
as early as possible. I need code to DO things, not to sound impressive.
So now I'll step down and get to it again.
February 18, 2001
Just wrapped the final spec on the ShadowNet core...the async app connect
request. I begin writing the actual Platypus apps now, initially testing in a
single-node environment without transport or serialized objects, but eventually
moving to a clustered one. I'm glad I came from this direction though -- absolutely
no way would I have been able to design top-down with the kind of
high-performance async I envision. That's the Java EJB flaw I think -- 99% of
the functions have to be synchronous somewhere, which immediately creates an
unnecessary bottleneck. Thanks to Sun for not supporting non-blocking IO.
Feh. Anyway, I'm moving along, really wish I had some Net access from home,
though. Grrr....
The next major phase is (ironically enough) deterministic program execution.
Meaning, processing node is already up, we get a "starter" node to come up,
contact it, and then begin feeding data in. Processing node notifies starter
when it's finished, and everybody then exit()'s. This is critical as it allows
existing application-based scheduling to control a Platypus cluster. Gotta
begin splitting apps up in the Makefiles though, and suck some of the ShadowNet
"application" stuff into the library proper (e.g. crypto).
Codebase currently 12,788 lines.
February 11, 2001
Cleaned up disconnect (connection list window was being stupid), now working on
asynchronous application pipe request. Added app instance ID to objects, need
to fix node_multiplex to find correct app instance rather than just the app
type.
Codebase now 11,800+ lines. Amazing how much it can do, though. I estimate
less than 3000 more before I get to distributed cluster/pipeline manager, which
leads straight into Platypus. That's essentially a C++ equivalent to session
EJB + workload management in a small fraction of the code size. Of course I've
thrown away the 20% functionality in EJB that accounts for 80% extra code
because I don't want to be absolutely tied to a database.
February 10, 2001
Lost the coding bug for a couple months while settling in to a new home and the
same life. Had to go through some Meaning Of Life bullshit, now I'm clear and
moving along again. Pushed the ship date to 1Q 2002, since Platypus must be
completed first and I've got a major amount of work-I-get-paid-for to do. This
just means ShadowNet will be much more hardened and robust, with some kind of
reasonable test environment ready for it by the time it ships. I'm not too
worried about missing my market, though -- this thing IS the next paradigm
shift.
Anyway, it'll take some days to get my head back into this. I forget how dense
this code really is sometimes. But I'm back in for the next round.
WRITE LIVES! Finally I have broadcast support for applications! Found a few
neat bugs in node_multiplex's run->write code, cleaned that up. FSM in read
seems to be working, now cleaning up the socket/node/app disconnect.
December 29, 2000
Got the basic UI debugged, now wrapping the sn_node_multiplex. I need to add
padding at the object serialization layer so that the underlying crypto layer
will dispatch in a timely manner. Also figured out that sn_write is easier
than most apps I'll need because it can operate in a broadcast manner, but for
clustering/pipelining I'll need to add application instance id's to make the
broadcasts behave more like a real 1-1 channel between apps. Won't be too
difficult, minor tweaks to sn_object_base and a subclass off runnable_queue to
hold the app ID.
Looks like I may not make the ship date, because I realize that I want to have
cluster management in the initial release (take THAT Sun Microsystems!). That
would make interfacing ShadowNet with things like MojoNation a bit more
enticing, since by simply running a ShadowNet node on your peer box and
permitting Mojo to use you in its cluster, you get access to everyone else's
CPU in the cluster (assuming your apps are parallelizable/pipelineable, which
nearly everything is). The Platypus project will also need it, so if I do that
one first I'll have exercised the clustering extensively before the ShadowNet
release.
The irony in all this is as I generalize things further and further the
specific "tricks" for ShadowNet become simpler to foresee and do. When this
releases there will basically be two ways of computing: inside and outside. If
you compute on the outside, you get to run OS-level processes, define your own
network protocols, etc., just direct your traffic to "X.domain.shadownet" and
your traffic by itself is shielded by the ShadowNet transport layer. Or
re-code your app so that it can be initialized and run from within the
ShadowNet application manager, and POOF you have transparent access to everyone
else's CPU/disk/etc resources just by doing a
"appmanager->connect_app(callback_apppipe_request)" and pushing your data to
the resulting queue. Attach a console app to any node, request a UI on that
node's cluster manager, and see everything there is to know about your
distributed app.
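The connect_app() usage pattern described above, as a purely illustrative sketch: every type and signature here is hypothetical except the call name appmanager->connect_app(callback_apppipe_request) and the push-to-the-resulting-queue idea from the entry.

```cpp
#include <queue>
#include <string>

// Hypothetical pipe type: the queue the app pushes its data into.
struct data_queue {
    std::queue<std::string> q;
    void push(const std::string &s) { q.push(s); }
};

typedef void (*apppipe_callback)(data_queue *);

// Hypothetical stand-in for the real appmanager: connect_app() registers
// the callback and returns the pipe queue for the caller to push to.
struct sn_application_manager {
    data_queue pipe;
    data_queue *connect_app(apppipe_callback cb) { cb(&pipe); return &pipe; }
};

void callback_apppipe_request(data_queue *) { /* pipe is ready */ }
```

Usage would then be a one-liner: `appmanager->connect_app(callback_apppipe_request)->push("work unit");`.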
One of my goals in the original design doc for this project was to make it
possible to utilize 100% CPU of all connected boxes to do productive things. I
think this design will get me there.
Codebase currently 11,470 lines. Only 6 lines containing #ifdef anywhere --
4 are #ifdef DEBUG, 2 are #ifdef SOLARIS.
December 25, 2000
Debugging the UI classes now. Those last few percent of functionality are
taking a lot of work and tweaking to get right. However, I've _almost_ got the
dialog/message box in place, I/O between parent and child classes, and the
UI event model worked out. The state switches in sn_window::input() are much
nicer than the old sn_application::process_windows() idea. Got the desktop
manager pushed out as a "normal" application now, also, so the special
exceptions are going away. Just need to ditch
sn_application_manager::application_manager_run in favor of a private exit()
call and it'll truly be generalized.
When this is finished sn_write() will be quite trivial. I think I'll do sn_ftp
after that, then start working on the packet driver before beginning the
ShadowNet classes proper. I figured out a _very_ simple interface to the
packet driver that will be nearly perfect for my use. Once that's up I'll have
basic tunnelling almost immediately (can we say traditional VPN?). Add the
ShadowNet classes and I'll be able to develop the dynamic virtual circuit-based
VPN, and just expose it to the apps as a normal old sn_node *. Of course I
could get channel-bonding in place even before that. Essentially, within a
couple weeks I'll have climbed out of the valley and apps will begin opening
up _very_ quickly.
It's a bit later, just got the connection list working to my satisfaction.
Need to enhance the dialog box to handle OK/ABORT's - I'd like to support
"Yes/No", "Ok/Cancel", and simply "Ok". Rather cool right now, though -- you
can connect back to the same instance and see both sides of the connection in
the connection window. So close...I want to go ahead and make an application
list window (just like connection list, supplied by the appmanager via
sn_application_list), so the user just generally makes a connection and then
attaches an app to that connection. Poof, instant write/ftp/etc. Once write
is done, I'll get the remote console working, then modify the logger to log to
_all_ console windows, then finally I'll be able to fully test the backend in
headless mode.
Codebase now over 11,000 lines.
December 18, 2000
It's funny how I'm almost to the point I had wanted to be two years ago:
capable of connecting two nodes. Every minor subsystem has become a battle to
find the one method I'll want to tinker with least in the future. I did a line
count at work today -- on the most recent project I've written 8,000 lines
there that are relatively speaking worth 1/10 of what ShadowNet can currently
do. These battles are difficult but worth it: competitors working in this
space would likely require 2-3x the codebase to accomplish these functions, and
they'd forever be unable to reach that last 15-20% I'm going for.
I've just wrapped the mental work on the window-app interaction. Essentially
it will be as follows:
sn_window will be subclassed into sn_window_application, which will have an id,
title (to look nice for the user), a list of child windows (subclassed from
sn_window_application containing only an additional parent id), and functions
to: read user input, (optionally) create output, and find the currently-focused
window (recursive call through child windows). sn_desktop_manager will be
modified to call the user input function on the currently-focused window on
every input event. The parent window will need to give its child window a
pointer to a callback state structure that contains whatever processed data the
child needs to return. Also, sn_application will be modified to have
sn_window_application a friend class, so that the windows can access their app
data. The open_window() function in sn_application will become get_window() so
the app can decide what top-level window to give the desktop manager. The end
result will be that the app and its windows will exist on separate threads,
windows being executed by the desktop manager thread. (I see little point in
multi-threading the windows, since they'll be moving at human speed anyway.)
There will be some locking issues the windows will need to resolve as they
modify the app's state, possibly should have them check for state==RUNNING, and
have the apps lock and read their state once only at the beginning of run(). I
thought of putting a message queue in the application window class ala Win32,
but don't see much point in it since windowing is all one thread -- events
should be processed in the precise order of generation anyway. So far nothing
in the design prevents other kinds of events from running, like mouse clicks.
Also, the parent-child-focus stuff will allow more complex windows like menus
to eventually be written. Running asynchronously from the app should make the
app's life a hell of a lot easier -- no more pushing data out to progress bars
and such, let the bar check the app directly and do its own calculation. I
also pave the way for drop-in replacement interfaces ala "skins" (skins in
ANSI...hehe). Finally, I can easily create a headless version of the entire
system this way.
Ok, now I'm going to sleep and code this up later. BTW I did a re-write of
COPYRIGHT.txt yesterday. I'm curious to see what kind of trouble it will stir
up on GA day.
December 17, 2000
Whew! Finally got the window into a mostly-working state, with border and
background color. I can put a window anywhere on the screen, and the printf()
family accepts relative client coordinates now. Got truncation fixed.
Discovered that Linux ANSI console counts starting from (1,1), not (0,0), so I
added another relative (x,y) start point (this is handy because when I render
another desktop within a window I can set that desktop's relative (x,y) to the
coordinate _I_ need on my local screen, and it'll all pass through OK). Now to
make the window state wrapper. Need to figure out how to hide the blinking
console cursor. Also see several cute bugs with node/socket/thread interaction
-- failed connects are creating nodes that don't die as expected. Oh well.
Also want to create a sn_logger_window, so that I can see the debug output
while the app is up without bleeding over everything else.
Codebase up to 9,900 lines. Almost to the 1st milestone.
[Later that day]
Got the sn_logger_window class working, discovered a neat class of bugs
surrounding the "just display the last changed line" code. Also added
primitive (I mean, _primitive_) scrolling support (hint: don't use
printf(x, y, ...) on a scrollable window, it may take a long-ass time to render
your line).
It's beginning to look purty. Like a real application. It's got two windows
now and a basic menu. I've got to put some thought on the window handler
utility classes before I proceed. I want to provide simple modal
message/dialogs/menus, so that will require a window tree under each
application, local output vs. global input focus, etc. etc.
Writing a general-purpose abstraction of an operating system for ANSI tty
terminals just so I can put up some flashy menus and multitask is becoming a
rather large job. I still don't have my apps yet! (But I'm loving every
success...there won't be much of softwaredom not at least touched on somewhere
by this project.)
Just broke the 10K mark: Codebase now 10,093 lines.
December 16, 2000
Added the attach_application() call to appmanager, this is essentially the
glue that puts applications and nodes together. Multiplexing nodes will
use this to find applications for new data, the UI will use it to attach
apps to existing network connections. It provides a cute and simple way
for apps to decide for themselves whether they want to be 1-1 or 1-n.
(Example: write is a 1-1, since it only connects 2 consoles. ftp will also
be 1-1. The platypus visit tokenizer will be 1-n.)
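The attach_application() idea above, sketched minimally: each app carries its own connection policy and the glue enforces it. Everything here except the call name and the 1-1 vs 1-n distinction is my guess.

```cpp
#include <vector>

struct sn_application {
    bool one_to_one;                 // write/ftp: true; platypus tokenizer: false
    std::vector<int> attached_nodes; // node ids currently attached to this app
};

// Returns false if a second node tries to attach to a 1-1 app;
// 1-n apps accept any number of attachments.
bool attach_application(sn_application &app, int node_id) {
    if (app.one_to_one && !app.attached_nodes.empty())
        return false;
    app.attached_nodes.push_back(node_id);
    return true;
}
```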
Been re-thinking the source release. Discovered that the ultimate purpose
of this project is really for myself and not the world as a whole. Thinking
it may be better just to let the rest of the world do whatever it will, this
code is for ME in the end. That path leads to removing the advertising and
just public-domaining the whole codebase. That would be the proper academic
way to do it. The whole copyright/licensing thing makes me want to puke.
I really don't care when Microsoft pollutes my network standards, really --
I just won't use their software. Finally, if I _do_ end up policing a
licensing model, I just end up helping people who are unable or unwilling to
code for themselves, my "allies" in the market. The only real "allies" I
need are other coders whose competence I can trust, and I figure if anyone
like that is interested in this project they'll help for their own reasons.
So far I'm leaning 75% that way -- I'll need a lot more convincing to put
up a license.
Codebase now about 9,600 lines.
December 10, 2000
Wrapping the multiplexing network code now. Got lots of little things to do to
get the write application up, but very very close to it now. Connections work,
crypto interface is in place (currently just passes raw data through though),
cleaned up the TCP sockets. The queue concept has turned this project around
in a big way -- it provides a very nice way for runnable's to know when they
get disconnected from each other. Finding it difficult to code on the
weekends, unfortunately -- the diet sucks a lot of brain energy out of me.
Oh well. At least now I'm on the high-level end of it. In a very short time
I'll do a limited source release to a handful of trusted people. Just want to
wrap the underlying architecture first.
On another note, Platypus has been keeping me up at night. I've figured out
how to do the aggregates in real time now, and a very simple way to distribute the
sub-applications across different CPU's (not required for ShadowNet since
ShadowNet uses connectionless applications). I'm going to have to put some
basic work in it soon. Made the decision to support the SurfAid Sydney schema
directly. Don't see much point in porting existing SurfAid to Perl or Linux or
anything since Platypus will be so much faster anyway.
Codebase now about 9,200 lines.
December 2, 2000
Got snarled in a few thread locking issues -- had to establish some
rules for when to use LOCK/UNLOCK because I discovered in single-thread mode
that an application can alter the state of its thread (via appmanager->add_thread()).
So the rules are: 1) you must LOCK before reading/writing class variables, BUT
2) you must UNLOCK before executing _any_ sub-functions.
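The two rules above can be sketched in code. The class shape is hypothetical; only the rules themselves (LOCK around member access, UNLOCK before any sub-call) come from the entry.

```cpp
#include <pthread.h>

class sn_thread_state {
public:
    sn_thread_state() : state_(0) { pthread_mutex_init(&lock_, 0); }
    int get_state() {
        pthread_mutex_lock(&lock_);       // rule 1: LOCK around member reads
        int s = state_;
        pthread_mutex_unlock(&lock_);
        return s;
    }
    void run_step() {
        pthread_mutex_lock(&lock_);
        int s = state_;
        pthread_mutex_unlock(&lock_);     // rule 2: UNLOCK before sub-calls --
        process(s);                       // an add_thread()-style reentry here
    }                                     // would otherwise deadlock on lock_
private:
    void process(int) {}
    int state_;
    pthread_mutex_t lock_;
};
```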
Another whole bag of snarls around the connect() call. Turns out connect() can
fail in two different places when in non-blocking mode, and I've got to be very
careful with the socklist in sn_daemon_tcp. Also found a stupid 1-byte mistake
that fucked up clean_list(). Urrr, hate that. Got it done, though, so now
we're focused on sn_node_network (now renamed to sn_node_multiplex). May take
a couple days to hammer out, this one's not so trivial as sn_node_direct and
it'll require some new API calls in appmanager (to find/create applications).
On the bright side, when multiplex is finished crypto1 will be very easy, then
we'll be done with the core. (Except for the huge end-to-end debugging effort
of course.) Glad I've got a few more months...I'm getting a bit tired and the
core is just a hair more complicated than I had hoped for. We do what we have
to in order to meet customer requirements, I guess -- especially when the customer is me.
An example of complex code: "(*((*c)->get_callback_function()))(*c);"
[ get (deref c)'s callback function and call it with (deref c) as the argument ]
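That dense line unpacks into two readable steps. The types below are invented so the sketch compiles (c would be an iterator or pointer over connection pointers); only the expression's shape comes from the entry.

```cpp
struct conn;
typedef int (*callback_fn)(conn *);

struct conn {
    callback_fn cb;
    callback_fn get_callback_function() { return cb; }
};

int on_data(conn *) { return 42; }  // example callback

// Equivalent of "(*((*c)->get_callback_function()))(*c);" with the
// two operations separated:
int fire(conn **c) {
    callback_fn fn = (*c)->get_callback_function();  // fetch the callback
    return (*fn)(*c);                                // invoke it on *c
}
```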
Codebase now 8,600+ lines.
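For reference, the two failure points of a non-blocking connect() mentioned above are standard POSIX behavior, not ShadowNet-specific code: connect() can fail immediately (anything other than EINPROGRESS), or the asynchronous completion can fail later, which you only discover by reading SO_ERROR once the socket polls writable.

```cpp
#include <sys/socket.h>

// Call after poll()/select() reports the connecting fd writable.
// Returns 0 if the connect completed, the deferred errno if it failed,
// or -1 if getsockopt() itself failed.
int finish_connect(int fd) {
    int err = 0;
    socklen_t len = sizeof(err);
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return -1;
    return err;
}
```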
I'd also like to begin writing about some of the implications of this project,
where it came from, possible places it could go, etc. I have the distinct feeling
that I'll need some ready answers when it releases.
November 22-25, 2000
Whew! I am reminded of the story in 5.3 of the Tao of Programming. In the
last three days I have completed about 85% of the core OS functionality,
including (last night and today) a virtual desktop + window manager.
Oooo goodie. :)
Finally fixed the damn Makefile to use makedepend, put the core stuff into
an actual library, fixed up the console, created the desktop manager,
ANSI "windows" (more like floating overlays, but will fix them up more later),
found some major bugs in the TCP sockets (reads and writes will step on each
other). Finished sn_protocol_direct, made a bunch of internal app rules,
fixed up the thread model, still debugging in single-thread mode though. Need
to find a good multi-threading debugger that can handle LinuxThreads --
gdb gets lost on pthread_create().
Summary then: we've got abstracted socket I/O (console, file, TCP), daemon,
applications, serializable objects, byte queues, object queues, support for
multiple desktops (only one permitted on the console), simple keyboard support
for applications (kbhit()/get_char()/printf()), ANSI color,
{raw data} ==> {object} conversion, marginally-stable multi-threadedness, and
Parberry-style "dprintf()" logging (neat trick -- debug printf's don't even
compile into the final release).
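The compile-away logging trick mentioned above is usually done with the classic pre-C99 double-parenthesis macro; the actual ShadowNet macro may differ. With DEBUG undefined, call sites expand to nothing, so the format strings never even reach the release binary.

```cpp
#include <cstdio>

#ifdef DEBUG
#define dprintf(args) printf args   // args is a whole parenthesized call
#else
#define dprintf(args) ((void)0)     // release: call site compiles away
#endif

// Usage -- note the double parens, which let one macro argument carry
// an entire varargs printf argument list:
//   dprintf(("read %d bytes from node %d\n", n, id));
```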
For the rest of this weekend I want to add an asynchronous connect() call to
the appmanager and create the FTP application. Haven't looked in detail at
the HS/Link code, but my gut is I won't need much of it after all. The async
call will require a new callback interface for sn_application, and the FTP
should let me test the connect, file I/O, and network I/O all at once.
We're almost through the "Virtual OS" portion, soon we'll be working to add
the "Network" to that. Still thinking about when to release the source --
would like to do so early enough to get comments on the OS core, but don't
want the vision corrupted before the real kahuna comes out. Gut tells me
to keep it closed all the way to the 0.99alpha release. Sometime next
quarter I'll have to start dividing the wish list into V1 and V1.1 features.
Source code now 8,200 lines, still a baby but growing rapidly now.
------------------------------------------------------------------------------
-- Everything below this line is crap! ---------------------------------------
------------------------------------------------------------------------------
July 16, 2000
Added support for multiple sn_console objects (only one is allowed on the
actual console at any given time).
July 15, 2000
Added native thread support, moved network layer off to separate native thread.
July 10, 2000
Implemented thread model. Renamed sn_server to sn_application_manager, combined
with sn_server_ip.
Todo:
sn_socket_tty, sn_protocol, sn_protocol_tcp.
Shift console from sn_console to sn_socket_tty.
Add native threads to sn_thread.
July 4, 2000
Created ShadowNet Project Homepage.
Immediate todo:
Nail down application-layer and protocol interface specs.
Create sn_socket_tty, sn_socket_file
July 3, 2000
Project compiles, currently has the following classes:
Network virtual layer:
sn_node
Network physical layer:
sn_socket
sn_socket_tcp
sn_daemon
sn_daemon_tcp
sn_server
sn_server_ip
Core:
sn_logger
sn_logger_tty
sn_memory
sn_exception
sn_signal
UI:
sn_ui_cli
sn_console
router
Crypto:
sn_crypto