In this video from Learning Web Development with Seaside we start on a simple application that will be used for a few more lessons. We create a domain model object for ‘FlightInfo’ that holds a date/time for the flight and a price for the flight. We then create a component to display the model object. For the tutorial PDF see http://seaside.gemstone.com/tutorial.html.
I’ve completed editing and posting the videos from the ‘Scaling Objects for the Enterprise’ presentation at Smalltalks 2009 in Buenos Aires. This post provides a summary list of links:
- What is GemStone?
- Object format and pointers
- Special Objects and Header
- Large Objects
- Repository, Extents, and Object Table
- Component Overview
- Shared Page Cache
- How Gems Create an Object
- How Gems Read an Object
- How Gems Modify an Object
- How A Gem Dereferences an Object
- Commit Record Backlog
- Handling Commit Record Backlog
- Commit Process
- Issues in Concurrency
- Lock Granularity
- Reduced Conflict Classes
- The Stone Process
- Live and Dead Objects
- Nine Steps of Repository-Wide Garbage Collection
- Epoch GC, Offline GC, Multi-machine
- Garbage Collection Q & A
If you cannot get access to YouTube (e.g., due to corporate web policies), you can get the QuickTime files directly from here.
This video records a few questions and answers. How effective is the mark-sweep? Which objects will be found? Does garbage collection reduce the size of the extent? How can you reduce the size of the repository? How do you manage a “warm” standby for quick recovery?
This is the final video from the GemStone/S tutorial at the Smalltalks 2009 conference in Buenos Aires.
This short video gives a brief description of three topics.
“Epoch” garbage collection is a variation on repository-wide garbage collection in which only objects created or modified during a specified period (“epoch”) are scanned. If young objects are more likely to be dereferenced, then this can be much more efficient than a full database scan.
“Off-Line” garbage collection is a variation on repository-wide garbage collection in which the most time- and resource-consuming steps are done on a backup of the database, with intermediate data structures imported into the main database to complete the process.
Multi-machine configuration allows virtual machines (“Gems”) to run on a host separate from the Stone host. This requires a shared page cache (SPC) on the remote machine, and various extra processes on both hosts. While more complex, it does allow significant scaling—with one customer using hundreds of machines all connected to a single database.
A frequent area of confusion in GemStone/S is the use of ports for the NetLDI when starting a remote gem. The Gem startup process is somewhat complex; describing it takes an hour or so of the Advanced Configuration class. I’ll see if I can condense it…
1. A GCI client (Topaz, GemTools, Jade, VW/VA, etc.) contacts the NetLDI on a host/port. The port can be specified by number (e.g., 50377) or by name (e.g., gs64ldi) in which case it should be in the client and server services file with the same name/number. This port is defined implicitly by the name provided to the startnetldi command. If no name is provided, then the default of ‘gs64ldi’ is assumed. This port is unrelated to the port range provided as an optional argument (described later).
2. The GCI client requests a service from the NetLDI; generally this is a new Gem using the script ‘gemnetobject’. The NetLDI puts the GCI client “on hold” while it starts up the new Gem. This conversation resumes at step #8.
3. The NetLDI listens on a new random port for a callback (see #5).
4. The NetLDI starts the Gem and passes it the port number from #3.
5. The Gem calls the NetLDI on the port from #3 to request further instructions.
6. The NetLDI tells the Gem that a GCI client wishes to make contact and that it should listen for the GCI client. If a port range is specified, then the NetLDI will select a port from the range and tell the new Gem to listen on that port. If no port range is specified then the Gem will allow the OS to pick a port at random.
7. Using the connection established in #5, the Gem tells the NetLDI which port it is listening on (from #6). The connection from #5 is then closed.
8. The NetLDI returns to the GCI client (using the connection established in #1 and put “on hold” in #2) and tells it what port to use to contact the Gem (the port number from #6). The connection between the GCI client and the NetLDI is then closed.
9. The GCI client initiates a call to the Gem on the same host as the NetLDI but on a different port–the port on which the Gem is listening. That port number will be one selected by the NetLDI if a port range is specified or one selected by the OS if no port range is specified.
10. The Gem accepts the connection from the GCI client and the GCI client provides login information (Stone NRS, userID, password, etc.).
Note that there are three (3) connections involved: (1) from the GCI client to the NetLDI; (2) from the Gem to the NetLDI; and, (3) from the GCI client to the Gem. The first connection is on the official NetLDI port that should be in the services file for each machine (gs64ldi/50377). The second connection is on a random port and there are no naming issues because it is completely self-contained in one host. The third connection is the one that can be random or can be limited to a particular range (using the -p50378:50378 argument to the startnetldi command). If there is a firewall on the server, then the ports for connections 1 and 3 need to be open to calls initiated from outside the server.
There are additional complexities. If the GCI client and the NetLDI are on the same host, then the named NetLDI does not need to be in the services file. Also, in steps 9/10 there are actually two connections, a primary one and an “out-of-band” connection. Furthermore, I could have gotten some of this wrong, but I think there is enough right to explain the difference between the number in the services file and the number in the port range.
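To make the difference between the services-file port and the port range concrete, here is a small Python sketch of the three connections. It is only a simulation: nothing here talks to a real NetLDI, the function names are made up for illustration, and the 49152 to 65535 range is an arbitrary stand-in for "a random port picked by the OS."

```python
import random

# Arbitrary stand-in for "the OS picks a random port" (an assumption for this sketch).
RANDOM_PORTS = (49152, 65535)

def pick_gem_port(port_range=None, rng=random):
    """Step 6: choose the port on which the new Gem listens for the GCI client."""
    low, high = port_range if port_range is not None else RANDOM_PORTS
    return rng.randint(low, high)

def start_remote_gem(netldi_port=50377, port_range=None, rng=random):
    """Walk through the handshake and return (description, port) for each connection."""
    connections = []
    # Steps 1-2: the client calls the well-known NetLDI port (gs64ldi/50377).
    connections.append(("GCI client -> NetLDI", netldi_port))
    # Steps 3-5: the NetLDI listens on a random port and the new Gem calls back on it.
    callback_port = rng.randint(*RANDOM_PORTS)
    connections.append(("Gem -> NetLDI", callback_port))
    # Steps 6-10: the Gem listens on a port from the range (or a random one)
    # and the client connects to it directly.
    gem_port = pick_gem_port(port_range, rng)
    connections.append(("GCI client -> Gem", gem_port))
    return connections

def firewall_ports(connections):
    """Ports that must accept calls from outside the server: connections 1 and 3."""
    return sorted({connections[0][1], connections[2][1]})

# Equivalent in spirit to starting the NetLDI with -p50378:50378.
conns = start_remote_gem(port_range=(50378, 50378))
for description, port in conns:
    print(f"{description}: port {port}")
print("open in firewall:", firewall_ports(conns))
```

With a one-port range the third connection is predictable, which is what makes a firewall rule possible; without a range, the last port changes on every login.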
This video starts with a description of “Shadow Objects,” objects for which a newer version has been saved in a later transaction. Then we move to the nine steps of repository-wide garbage collection:
- Mark/sweep (to identify live objects);
- Object table sweep (to create a list of possible dead);
- Record possible dead;
- Voting on possible dead by current gems;
- Finalize voting (on behalf of recent commits);
- Read possible dead objects (cleanup);
- Promote to dead (transition);
- Dead object reclamation; and
- Return of pages and object IDs to the free pool.
Understanding these steps and being able to identify where we are in the process is very important to managing a production database since GC may require significant resources.
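As a rough illustration of why an object is only "possibly" dead until voting completes, the following Python sketch narrows a candidate set through the early steps. It is a toy model, not GemStone code: the function and parameter names are hypothetical, and it assumes that a vote simply removes any candidate that a gem (or a recent commit) still references.

```python
def find_dead(all_objects, live, gem_references, recent_commit_references):
    """Toy model of steps 1-7: narrow 'possible dead' down to 'dead'."""
    # Steps 1-3: anything the mark/sweep did not reach is recorded as possibly dead.
    possible_dead = set(all_objects) - set(live)
    # Step 4: each current gem votes; objects it still references are taken off the list.
    for referenced in gem_references.values():
        possible_dead -= set(referenced)
    # Step 5: voting is finalized on behalf of commits made since the scan began.
    possible_dead -= set(recent_commit_references)
    # Steps 6-7: what remains is promoted to dead and can be reclaimed (steps 8-9).
    return possible_dead

dead = find_dead(
    all_objects={1, 2, 3, 4, 5},
    live={1, 2},
    gem_references={"gemA": {3}},      # gemA still holds object 3 in memory
    recent_commit_references={4},      # object 4 was re-attached by a recent commit
)
print(sorted(dead))
```

In this example only object 5 survives every filter, so only object 5 would be reclaimed.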
Because GemStone Smalltalk does not have a native GUI, most developers use a “client” Smalltalk with some tools that give access to the “server” Smalltalk. Traditionally, this has been GemBuilder for Smalltalk (GBS) installed into a VA Smalltalk or VisualWorks client. With the advent of GLASS (and a no-cost license for GemStone/S 64 Bit), tools are also available for Squeak/Pharo (named GemTools).
Because it is non-trivial to set up a local Squeak/Pharo system, we have provided a pre-built “one-click” download that works on Linux, Macintosh, and Microsoft Windows. Of course, once something like this is made, it is immediately out-of-date and it becomes a challenge to provide updates. Updates to GemTools itself, while significant, are at least confined to executing Smalltalk code in the client. Updates to the client environment are, however, more complex. Also, there are those who want to build things from scratch so as to better understand what is going on.
A further complication is that once you have a working environment, you are less likely to rebuild it from scratch. This leaves the frustrating experience that “it works for me” and “when I followed the instructions I get a different result.” In our case, the instructions refer to third-parties (the Squeak/Pharo web sites) and simply direct you to follow their download and install instructions. When those instructions change, or the things being installed change, the results are likely to change (and not for the better!). It was into this pit that Friedrich fell a few days ago. He faithfully followed the instructions, but got a different result from those of us who followed the instructions sometime earlier.
Given that Squeak/Pharo is an open-source project and is rapidly developing (and diverging?), keeping GemTools up-to-date is a major challenge. One of the biggest challenges is related to the external library that is used on the client to interact with the server. The name and location of the library seem to be different for each platform and VM. Debugging these problems is particularly challenging, since one gets little more than the error “Unable to find function address.”
As of the date of this blog post, the Pharo download page provides two sources for a client virtual machine. One takes you to a place to get builds of Squeak-3.11.3.2135 for various operating systems, including Linux; the other immediately downloads pharo-vm-0.15.2f-linux.zip. If you take the “Pharo” VM, things work; if you take Squeak-3.11.3.2135, then things don’t work. It turns out that the latest Squeak VM has a change in how external libraries are referenced.
When the new VM was released, Ian Piumarta included a release note:
Plugin search strategy rationalised and simplified. Default location is now the executable directory (where ‘squeakvm’ is installed). The -plugins argument can be a colon-separated list of locations to search, just like PATH. Plugins are named ‘so.plugin’ to make interference with FFI libraries less likely.
Thus, we need to change not just the location of the library, but also its name. Instead of putting ‘gciForLinux.so’ next to the image, you now need to put ‘so.gciForLinux’ next to the virtual machine. Furthermore, when opening the library you need to leave off the ‘so.’ prefix.
So, if you are using Squeak-3.11.3.2135, put ‘so.gciForLinux’ next to the VM, and edit GciLibrary>>#’moduleName’ as follows:
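moduleName
	"Answer the name of the GCI library to load for the current platform."
	| path |
	self isOnWindows ifTrue: [^'gciForWindows.dll'].
	self isOnMacintosh ifTrue: [^'gciForMacintosh.so'].
	self isOnLinux ifFalse: [self error: 'Unknown platform!'].
	"Newer Squeak VMs look for 'so.gciForLinux' next to the VM and want the name without the 'so.' prefix."
	path := SmalltalkImage current vmPath , 'so.gciForLinux'.
	(FileDirectory new fileExists: path) ifTrue: [^'gciForLinux'].
	"Assumed fallback for older VMs: the library 'gciForLinux.so' sitting next to the image."
	^'gciForLinux.so'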
Once this is done, the client should be able to interact with the GCI library. We should be able to update GemTools with this method soon.
This video introduces the topic of garbage collection (GC), one of the more complex subjects in GemStone Smalltalk. There are two types of GC: (1) in a local virtual machine (the Gem), and (2) across the entire Repository (persistent objects). “Live” objects are objects that are referenced from a persistent root. “Dead” objects are those that are not referenced from a persistent root. To identify live objects, we need to start from a system-defined well-known root (AllUsers, the collection of UserProfile instances) and identify each referenced object. This process can take hours or days.
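The idea of finding live objects by walking references from a well-known root can be sketched in a few lines of Python. This is a generic reachability walk over a hand-made object graph, not GemStone's actual mark/sweep; the object names other than AllUsers are invented for the example.

```python
from collections import deque

def mark_live(root, references):
    """Return every object reachable from root by following references."""
    live = {root}
    pending = deque([root])
    while pending:
        current = pending.popleft()
        for child in references.get(current, ()):
            if child not in live:      # visit each object only once
                live.add(child)
                pending.append(child)
    return live

# Hypothetical object graph: each key references the objects in its list.
references = {
    "AllUsers": ["userProfile1"],
    "userProfile1": ["customers"],
    "customers": ["customerA", "customerB"],
    "orphan": ["customerA"],   # points at a live object but nothing points at it
}

live = mark_live("AllUsers", references)
all_objects = set(references) | {c for children in references.values() for c in children}
dead = all_objects - live
print("live:", sorted(live))
print("dead:", sorted(dead))
```

Note that "orphan" is dead even though it references a live object: what matters is whether something reachable from the root refers to it, not what it refers to. The walk has to touch every live object once, which is why a scan of a large repository can take so long.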
This video describes the ‘Stone’ process in the GemStone Smalltalk system. The Stone allocates resources (object identifiers, free extent pages in the repository). The Stone manages the object lock requests. The Stone manages gem-to-gem communication. The Stone manages the commit process, including notifying gems that are interested in changed objects. The Stone supervises garbage collection, including the commit record backlog. The Stone manages checkpoints, the act of flushing all dirty pages from the shared page cache to the extents.
This video describes Reduced Conflict classes. As has been discussed earlier, if two sessions edit the same object, the second one to attempt to commit will get a write conflict and the commit will fail. For certain collections, like Bag and Dictionary, this is a problem because multiple sessions might want to add items to a collection (think of a collection of customers or orders). GemStone Smalltalk provides a few special classes that allow for operations on an object with reduced risk of conflicts. We discuss an RcIdentityBag, an RcCounter, an RcQueue, and an RcKeyValueDictionary.
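To see why a plain shared object causes commit failures and how a reduced-conflict design can avoid them, here is a toy Python model of optimistic concurrency. All class and method names are invented for this sketch, and the "one slot per session" counter is a simplified assumption about how such a class can avoid conflicts, not a description of the actual RcCounter implementation.

```python
class Repository:
    """Toy committed state: a value and a version number per key."""
    def __init__(self):
        self.values = {}
        self.versions = {}

    def begin(self):
        return Session(self)


class Session:
    """A transaction that sees a snapshot and commits optimistically."""
    def __init__(self, repo):
        self.repo = repo
        self.seen_values = dict(repo.values)
        self.seen_versions = dict(repo.versions)
        self.writes = {}

    def read(self, key, default=0):
        if key in self.writes:
            return self.writes[key]
        return self.seen_values.get(key, default)

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Write conflict: someone committed a change to a key we also changed.
        for key in self.writes:
            if self.repo.versions.get(key, 0) != self.seen_versions.get(key, 0):
                return False
        for key, value in self.writes.items():
            self.repo.values[key] = value
            self.repo.versions[key] = self.repo.versions.get(key, 0) + 1
        return True


repo = Repository()

# Plain counter: both sessions modify the same object.
s1, s2 = repo.begin(), repo.begin()
s1.write("counter", s1.read("counter") + 1)
s2.write("counter", s2.read("counter") + 1)
plain_results = (s1.commit(), s2.commit())

# Reduced-conflict counter (assumed design): each session updates its own slot.
s3, s4 = repo.begin(), repo.begin()
s3.write(("rcCounter", "s3"), s3.read(("rcCounter", "s3")) + 1)
s4.write(("rcCounter", "s4"), s4.read(("rcCounter", "s4")) + 1)
rc_results = (s3.commit(), s4.commit())
rc_total = sum(v for k, v in repo.values.items()
               if isinstance(k, tuple) and k[0] == "rcCounter")

print("plain counter commits:", plain_results)
print("reduced-conflict commits:", rc_results, "total:", rc_total)
```

The trade-off in this sketch is that reading the total means summing every slot, so writes get cheaper in terms of conflicts while reads do a little more work.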