
I’ve been looking at various integrated development environments (IDEs) and thinking about how they could contribute to the Smalltalk ecosystem. Visual Studio Code (VSCode) is a cross-platform, extensible IDE.

The VSCode concept of a workspace is a bit confusing, partly because of the overlap between a folder and a workspace. In this blog post I’ll try to clarify some of the terms.

File

When you open one or more files, most VSCode features are available, but VSCode does not save settings or reapply them when the files are reopened.

Folder

In addition to opening files, VSCode can open a folder and allows the user to perform various file management operations (add, delete, move, copy, rename, etc.). VSCode provides a tree view that can be used to navigate the file hierarchy.

A folder is an implicit workspace (see below) and you can configure the IDE with local settings. These settings are saved in ./.vscode/settings.json and when the folder is reopened, the settings will be reapplied.
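For example, a folder’s local settings file at ./.vscode/settings.json might look like the following (the particular settings shown are just illustrative):

```json
{
  "editor.tabSize": 2,
  "files.trimTrailingWhitespace": true
}
```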

Workspace

When dealing with a single folder (see above), VSCode saves settings in a hidden folder in the root folder. VSCode also has the concept of a workspace that can define an ordered collection of folders and local settings. The workspace definition and configuration are saved as a <name>.code-workspace file (this file doesn’t need to be in the folder, but could be). A workspace can contain zero or more folders.

No Folders

If there are no folders in a workspace, then the definition simply contains local configuration settings. This might be useful if you want to use the IDE to work with various individual files and use custom configuration settings across the various files.

One Folder

If there is exactly one folder in a workspace, then it is similar to the Folder mode (see above), but instead of a hard-coded file in a hidden folder in the root folder, the settings can be saved outside the folder and opened directly from the operating system file explorer.

Multiple Folders

VSCode documentation refers to a workspace with two or more folders as a Multi-root Workspace. The typical use case is working with multiple folders in unrelated locations (e.g., code in /opt and settings in /etc).
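For example, a two-folder workspace matching the /opt and /etc scenario could be defined in a myapp.code-workspace file something like the following (the paths and setting are illustrative):

```json
{
  "folders": [
    { "path": "/opt/myapp" },
    { "path": "/etc/myapp" }
  ],
  "settings": {
    "files.trimTrailingWhitespace": true
  }
}
```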

A somewhat surprising characteristic of a multi-root workspace is that the first folder carries special significance (presumably a carry-over from the one-folder implicit workspace model). In order to maintain the proper value of rootPath (a deprecated read-only internal variable), a change in the first folder causes the window to be reset:

If the first workspace folder is added, removed or changed, the currently executing extensions (including the one that called this method) will be terminated and restarted so that the (deprecated) rootPath property is updated to point to the first workspace folder. [emphasis added]

https://code.visualstudio.com/api/references/vscode-api

If the extension holds any resources (e.g., a socket to a database), then those resources will be lost. Take care!

Booch’s classic book on OO (now in its third edition with five co-authors) is an excellent introduction to object-oriented concepts. I read the second edition in the 1990s when I was learning Smalltalk and I find myself still visualizing some of the images or figures used to illustrate the concepts (especially the cat) when I think about OO. This year I taught a 10-week undergraduate course on OO and used Booch’s book as a primary text (I also use a Booch video on the history of computing in a survey class). This gave me a chance to revisit the text in detail a couple decades after I was introduced to it.

Although the book has aged well, I found that there were a few areas where I wanted to respond. These comments or criticisms should be taken in the larger context of my appreciation for the book and its influence on me and others.

Aggregation

Aggregation is introduced in the discussion on Relationships among Objects (section 3.2 of the 3rd edition). Booch describes “two kinds of objects relationships [that] are of particular interest”, links (denoting a peer-to-peer relationship) and aggregation (denoting a whole/part relationship).

Booch asserts (p. 91), “By implication, an object that is part of another object has a link to its aggregate [emphasis added].” I believe that the link is from the aggregate (whole) to the part. I’m not sure if this is a typo or if I’m not understanding the vocabulary.

Booch continues with a discussion of “physical containment”:

Aggregation may or may not denote physical containment. For example, an airplane is composed of wings, engines, landing gear, and so on: This is a case of physical containment. On the other hand, the relationship between a shareholder and his or her shares is an aggregation relationship that does not require physical containment. The shareholder uniquely owns shares, but the shares are by no means a physical part of the shareholder. Rather this whole/part relationship is more conceptual and therefore less direct than the physical aggregation of the parts that form an airplane. [p. 92]

Is this simply suggesting that objects can represent both physical and conceptual things? If so, then why discuss it as part of aggregation? Or is there more intended here? Is there a type of aggregation that includes “containment” and a type that doesn’t? If there is “physical containment,” is there also “non-physical containment”, or is containment only (always?) for physical objects? Could you have aggregation of physical things (say, a fleet of vehicles) without them being bolted together? The suggestion that some types of aggregation are “less direct” seems to allude to the aggregation vs. composition discussion (ownership and lifetime), discussed next.

Finally, section 3.2 ends with the assertion that “There are clear tradeoffs between links and aggregation [emphasis added].” I don’t see the clear distinction, or how links permit looser coupling.

Physical Containment and Composition

Section 3.4, Relationships among Classes, returns to the topic of aggregation, and under the heading Physical Containment, explains that with “physical containment … [an] object does not exist independently of its enclosing” aggregate. “Rather, the lifetimes of these two objects are intimately connected” such that if the whole is destroyed, by implication we also destroy the part. [p. 110] But returning to the aircraft example from page 92 above, it doesn’t seem necessary that an engine’s lifetime (or that of the seat-back entertainment system) is tied to the airplane on which it is installed. It can be (more or less easily) unbolted and moved to another aircraft.

Continuing on page 110, Booch introduces “A less direct type of aggregation …, called composition, which is containment by reference. … [Here] the lifetimes of these two objects are not so tightly coupled as before: We may create and destroy instance of each class independently.”

A few chapters later, in section 5.7, Class Diagrams, Booch returns to a discussion of aggregation (to describe the notation). Here, “composition (physical containment)” is described as parts that “are defined as having no meaning outside the whole, which owns the parts; their lifetime is tied to that of the whole.” [p. 196]

So, on page 110 composition is “not so tightly coupled” and objects can be created and destroyed independently. But on page 196 composition is equivalent to physical containment and tightly coupled. I think that the later discussion is consistent with the remainder of the book (and with the outside references I’ve found), so I think that the earlier discussion is wrong.

Install Android Studio and Emulator

Download Android Studio and copy to your Applications folder. Launch it and complete the install process using the provided wizard. Using the Configure menu at the bottom of the Welcome screen, open the SDK Manager.

Confirm the SDK location (we will need this later) and that you have the Android 9.0 (Pie) SDK installed.

Close the SDK Manager and open the AVD Manager (from the Configure menu on the Welcome screen).

Create a new Virtual Device accepting the default device.

Select the Pie system image (downloading it if necessary).

On the last page confirm the setup and click Finish.

Once you have a device, click the green triangle to start the emulator.

Once started, the emulator should show the phone starting up.

Install React Native

Install Node.js. It is helpful to have certain packages installed “globally”, but it is considered bad practice to use sudo for this. Instead, follow these instructions. Install the remaining tools using the following command:
npm install -g react react-native react-native-cli
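The sudo-free “global” install can be sketched as follows (the ~/.npm-global prefix directory is an illustrative convention, not part of the original instructions):

```shell
# Create a user-owned directory for global npm packages (illustrative path).
NPM_PREFIX="$HOME/.npm-global"
mkdir -p "$NPM_PREFIX/bin"
# Tell npm to install global packages there (run once):
#   npm config set prefix "$NPM_PREFIX"
# Make the installed executables visible; add this line to ~/.profile too:
export PATH="$NPM_PREFIX/bin:$PATH"
```

With the prefix set, `npm install -g` writes into your home directory and no longer needs root permissions.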

Build an App

From a command prompt, enter the following:
react-native init HelloWorld
cd HelloWorld
react-native run-android
This will open a new Terminal with the title “Running Metro Bundler on port 8081.”

Missing JDK

At this point I got an error that I need to install the Java Development Kit (JDK).

Download the JDK.

Open the package and step through the install process. Then try the startup again:
react-native run-android

“SDK location not found”

“SDK location not found. Define location with sdk.dir in the local.properties file or with an ANDROID_HOME environment variable.”

As we did to handle the global install of Node.js packages, we will add a line to .profile with the SDK location, source the file, and try again:
echo "ANDROID_HOME=/Users/jfoster/Library/Android/sdk" >> ~/.profile
source ~/.profile
react-native run-android

This should have solved the problem, but didn’t for me. So, I added a line to the local.properties file:
echo "sdk.dir=/Users/jfoster/Library/Android/sdk/" >> ./android/local.properties

“adb: command not found”

The next error I got was “adb: command not found”. It turns out that some executables are installed but not on the PATH. So we need more updates to our environment variables:
echo 'export PATH=$ANDROID_HOME/platform-tools:$PATH' >> ~/.profile
source ~/.profile
react-native run-android
Finally, we get the app running!

Customize the App

Edit App.js to customize the message, save the file, and then in the emulator double tap R to reload the app.

Remote Debugging

Install RNDebugger to support debugging. With the emulator in the foreground, press <Ctrl>+<M> on the keyboard. This should bring up a developer menu. Select “Debug JS Remotely”, then Reload (either from the menu or by double tapping R).

Add some debugging code to App.js; for example, add a line to the end of the file:
console.log("This is a message from my code to the console");
Save App.js, then in the emulator double tap R to reload the app. You should see your message in the debugger.

If in Doubt, Restart

Not infrequently, things don’t seem to work right; when that happens, I close the Metro Bundler terminal, quit the app on the phone (tap the share button at the bottom right of the phone, then swipe the app up and away), and restart the app (react-native run-android). You could also try quitting the emulator and restarting it.

Common errors resolved below:
– “ERROR: JAVA_HOME is not set”
– “SDK location not found. Define location with sdk.dir in the local.properties file or with an ANDROID_HOME environment variable.”

Introduction

I’m presently teaching a class on mobile application development and have decided to share my notes about how to get things working. For this exercise I’m starting with a clean install of Windows 10 64-bit and I will walk through the steps I took, including the errors I encountered (so that if you get the same errors you might find this blog post!).

Install Android Studio and Emulator

Download Android Studio and run the installer. Check the box to install the Android Virtual Device. Note that the dialog warns that the SDK location should not contain whitespace:


So, I’ll just put the SDK into C:\Android.

The download and unzip process takes some time. When it finishes we have the Welcome screen. At the bottom there is a drop-down menu:

Select the SDK Manager command to confirm that you have installed an SDK in a path without a space. Make a note of this since we will need it for the ANDROID_HOME environment variable later:

Select the AVD (Android Virtual Device) Manager command to view the emulator that got created for you. Note that this is a “Google API” emulator. We may want to create another one with “Google Play”.

Click the green arrow under the Actions column to start an emulator. I got a warning that the emulator is using a “compatibility renderer”, which is fine with me. The emulator will show in the Task Manager as “qemu-system-x86_64.exe” and will take a lot of memory and CPU!

Install React Native Tools

Install Node.js accepting all the defaults. When that finishes run the following in a command shell:
npm install -g react react-native react-native-cli

Build an App

From a command prompt enter the following:
react-native init HelloWorld
cd HelloWorld
react-native run-android
This will open a new “node” window with the title “Running Metro Bundler on port 8081.”

If you get a Windows Security Alert asking if you want to allow an application to communicate on the network, click “Allow access”:

Errors from Missing Environment Variables

On the original command prompt, you may see the following:
“ERROR: JAVA_HOME is not set and no ‘java’ command could be found in your PATH.”
To solve this open your system settings and add an environment variable for
“JAVA_HOME”
with the value
“C:\Program Files\Android\Android Studio\jre”
Close the node window, reopen the command shell, and try again:
cd HelloWorld
react-native run-android

The process should get further but is likely to give you an error:
“SDK location not found. Define location with sdk.dir in the local.properties file or with an ANDROID_HOME environment variable.”
Just as we did above with JAVA_HOME, we now need to set an environment variable for
“ANDROID_HOME”
with the SDK location selected above and found on the SDK System Settings:
“C:\Android”
Close the node window, reopen the command shell, and try again:
cd HelloWorld
react-native run-android
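If you prefer the command line to the Settings UI, the same two variables can be set from a Command Prompt with setx (using the paths chosen above; open a new shell afterward so the values are picked up):

```
setx JAVA_HOME "C:\Program Files\Android\Android Studio\jre"
setx ANDROID_HOME "C:\Android"
```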

Customize App

If all goes well, the build should succeed, the Metro window should show the build progress, and the emulator should show “Welcome to React Native!”.

Open App.js using a text editor:

Replace “Welcome to React Native!” with something personal and save the file.

Return to the emulator and “Double tap R on your keyboard to reload”.

Remote Debugger

Install Chrome to support debugging. With the emulator in the foreground, press <Ctrl>+<M> on the keyboard. This should bring up a developer menu. Select “Debug JS Remotely”:

When Chrome opens, press <Ctrl>+<Shift>+<J> to open the console.

Add some debugging code to App.js; for example, add a line to the end of the file:
console.log("This is a message from my code to the console");
Save App.js, then in the emulator double tap R to reload the app. You should see your message in the Chrome console.


Update: See this Docker image.

People sometimes ask about making GemStone available in a Docker container. The demand seems to be driven by a desire for a simplified install/deployment process. In preparation for my ESUG talk on a cloud-hosted GemStone IDE I decided to do some investigation.

Docker is an alternative to a virtual machine (which I’ve written about extensively on this blog) where the guest environment (“container”) shares not just the hardware with the host and other guests, but also the OS kernel. What is isolated is the application and supporting libraries. It does allow simplified deployment and ensures that each installation has exactly the same libraries and other components. The container can also be isolated so that (by default) it does not write outside its boundaries. These are attractive features.

Docker

A typical description of Docker suggests that each container uses the host OS, but it really uses the host OS kernel, and on macOS it runs an embedded Linux instance. Thus, a single Docker container can run on various OS hosts. Contrary to my initial misconception, you don’t need a separate Docker container for every OS and version.

A more challenging issue is what should go in the container. Obviously, it should include GemStone (and you would need a different container for each GemStone version), but what else? Should it include a web server? If so, which one? Perhaps not, but omitting it increases complexity, since you then need to install, configure, and coordinate multiple components.

One hurdle is that once built, the software in a container is essentially fixed. You do not upgrade the contained software; you replace the container. Furthermore, I hope your container does not have any persistent data – it’s not supposed to, meaning thou shalt not run a database inside a container. Or, at least, not a production database; running a development database inside a container might be a good idea.

More specifically, you need to make sure that any persistent data is held outside the container, meaning that your container is not so isolated. Volumes are the preferred mechanism for persisting data generated by and used by Docker containers. And you would still need to manage backups and related system administration tasks.

For things like Topaz (a command-line interface to GemStone), you could run a command in the container. For RPC Gems (used by most client applications and IDE tools such as Jade), you should need to have only one port open into the container since the server-side Gem would be inside the container as well.
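As a sketch of how the container and its persistent data might be separated, consider a compose file like the one below. The image name, port, and paths are illustrative assumptions (there is no official GemStone image implied here):

```yaml
# docker-compose.yml sketch: persistent data lives in a named volume,
# outside the (replaceable) container.
services:
  gemstone:
    image: example/gemstone:3.6          # hypothetical image name
    ports:
      - "50377:50377"                    # single port for RPC Gems (illustrative)
    volumes:
      - gemstone-data:/opt/gemstone/data # extents, tranlogs, backups
volumes:
  gemstone-data:
```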

Overall, I don’t find it very difficult to install GemStone, but if a Docker container would make someone more likely to investigate GemStone, then it shouldn’t be too difficult to provide one.

A question on the GLASS mailing list raised some interesting questions about garbage collection and persistent executable blocks. Some objects, such as instances of SortedCollection, may contain a reference to an executable block and this persistent reference may be difficult to update and may hold references to other objects that otherwise could be removed by the garbage collection process.

Simple vs. Complex Blocks

The first issue raised by the question is the distinction between a “simple block” and a “complex block.” In Smalltalk a block may reference variables outside the block and, if it does so, then the virtual machine needs to create a more complex data structure to capture the referenced objects (making it a “complex” block). If the block does not reference any variables outside its immediate scope (beyond the block arguments and block temporaries), then the virtual machine can typically use a much simpler implementation (making it a “simple” block).

Because of the extra overhead, complex blocks tend to take more room and be slower to execute. Thus, a common performance tuning activity is replacing complex blocks with simple blocks when possible, and this can occur in application code as well as vendor-supplied libraries. For example, in GemStone/S 64 Bit version 2.x, the implementation of SortedCollection>>#’addAll:’ is essentially the following:

aCollection do: [:each | self add: each].

The problem with this implementation is that the block references self, making it a complex block. In version 3.x the implementation changed (influenced in part by Paul Baumann’s work), and the code is essentially the following:

aCollection accompaniedBy: self do: [:me :each | me add: each].

The block in this code is a simple block since it does not reference self (or anything outside the block), and performance improved.

While this is all very interesting (and in some cases quite useful), I think that the underlying garbage collection problem is not really with simple vs. complex blocks, but with something else.

Class References in Blocks

A code block in a method will contain a reference to the method and thus to a class (or metaclass for a class-side method). Once the code block is persistent, there is a hard reference to the class (and its class variables, class instance variables, and pool variables). Changes to the class schema will create a new “version” of the class but the old version will remain and be referenced by the code block. Even if the old class version is removed from the ClassHistory and all instances of the old version are migrated to the new version, until the code block is replaced in the persistent object, there will be a hard reference to the old class version and its space cannot be reclaimed by the garbage collection process.

To see how this works consider the method SortedCollection>>#’_defaultBlock’:

| block |
block := SortedCollection new _defaultBlock. "anExecBlock2"
block _debugSourceForBlock. "'[ :a :b | a <= b ]'"
block method. "aGsNMethod"
block method isMethodForBlock. "true"
block method inClass. "SortedCollection"

Note that this situation exists whether the block is simple or complex, and whether the block comes from an instance-side method or a class-side method. Thus, this is an orthogonal issue and a separate concern.

Alternatively, note how an equivalent block can be created by evaluating a String, but does not contain a reference to a Class:

| block |
block := '[ :a :b | a <= b ]' evaluate. "anExecBlock"
block _debugSourceForBlock. "'[ :a :b | a <= b ]'"
block method. "aGsNMethod"
block method isMethodForBlock. "true"
block method inClass. "nil"

So, as suggested, one could strip the class reference with the following code:

^ aBlock _sourceString evaluate

Alternatively, one could avoid the private method (#’_sourceString’) by making the String explicit:

^ '[ :a :b | a <= b ]' evaluate

While this avoids the private method, it hides the code in a string making it much more difficult for tools to recognize that this is code (and catch compile errors, support senders/implementors, provide for refactorings, etc.). Further, each of these approaches will create a new instance each time it is called, creating unnecessary objects.

As Dale noted in the email chain referenced above, we could modify the compiler so that a block did not know the class in which it was created. This might improve the garbage collection situation, but would make it more difficult to debug code and track a block back to its source. And, of course, you would need to wait till that feature was added to the product.

One way to address these issues is to create a new class with no instance variables (so that the schema would never change) and use class-side methods to return the desired blocks. The method that would otherwise hold the code block would use a level of indirection to return the actual block. If a code block needed to change, you would create a new class-side method in the special class and edit the indirecting method (so that references to the old code block could be found and modified). The original method would remain as long as there were references to it.
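A sketch of this pattern (the class and method names below are purely illustrative):

```smalltalk
"BlockLibrary is a class with no instance variables, so its schema
(and thus its class version) never changes."

"A class-side method on BlockLibrary holds the actual block:"
ascendingSortBlock
	^ [:a :b | a <= b]

"The indirecting method in the application returns it, so references
to the block can be found and modified via senders of this method:"
defaultSortBlock
	^ BlockLibrary ascendingSortBlock
```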

At the cost of a level of indirection (and reduced encapsulation), this gives you actual source code that can be recognized by tools, a single instance, and no stray references from the class (since there is only one class version).

[Note: I found this in “Drafts” and decided to publish it even though the original question was a couple years ago.]

 

A video of my presentation on GemStone/S and SQL is available here. Unfortunately, the audio is not very strong.

GemStone/S has a configuration option, STN_TRAN_FULL_LOGGING, that, if set to TRUE, causes all transactions to be recorded to the redo log. In this mode a backup plus transaction logs can be used to recover if the extents are lost. If the option is set to FALSE, small transactions are still logged, but large transactions cause a “checkpoint” in which all data is written to the extents before the transaction completes. In this mode recovery from a crash is possible if all the extents and the most recent transaction log are available. The logs themselves are not sufficient to allow recovery from a backup in partial logging mode, so partial logging mode is inappropriate for production (but handy sometimes for development, since older logs are typically deleted automatically).
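For reference, the option is set in the stone’s configuration file; a fragment might look like this (the file name varies by installation):

```
# In the stone's configuration file (e.g., system.conf):
STN_TRAN_FULL_LOGGING = TRUE;
```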

In some applications or situations it would be nice to segregate transactions into those that should be logged for crash recovery and those that can be easily recreated if needed (and avoid the overhead of time and space in writing to the log). For example, it is common to do a bulk-load of new data before it is needed by the application. If the system crashes, we’d prefer to recover the live data quickly and reload the new data later. With Oracle one can specify a table as being NOLOGGING and then do a series of “Direct-Load INSERT” commands that bypass the rollback/replay log. A key limitation with this capability, however, is that it does not support referential integrity. Likewise, with SQL Server one can do a BULK INSERT that will, under some circumstances, have minimal logging. By default, this will leave the constraint on the table marked as not-trusted.

For a relational database, a foreign key is a value like any other (typically an integer or string) and if a JOIN fails to find a match then the matching data is NULL or the row is ignored. In GemStone/S, an object reference is guaranteed to be valid and there is no easy way to handle data that lacks referential integrity. Thus, if an object is visible in the database (e.g., from a bulk load), then any session can, in a transaction, create a reference to it. If the reference is in the transaction (redo) log, then the object must itself be in the transaction log. Therefore, because in GemStone/S referential integrity is required, not optional, the relational approach to avoiding the transaction log is not safe.

The relational solutions described above for reducing transaction log activity allow for a bulk load of data that can eventually (following a full backup, recreation of indexes, and reapplication of any constraints) be equivalent to all other data. This fits an application upgrade (bulk load) use case quite well.

Another possible use case is short-lived data that is to be shared by multiple sessions but will not be saved over the long-term. One example is transient data where a summary is to be kept but the raw data discarded after the analysis. Another example is information about logged-in sessions with in-progress activity (such as Seaside session state) that could be safely lost in a system crash (presumably the users would need to start over anyway). This use-case does not present the same referential integrity concerns and GemStone/S is developing a feature that is intended to address this use-case: a SymbolDictionary in Globals named #’NotTranloggedGlobals’. The idea is that any object referenced from this root would not be recorded in the transaction log and should not be referenced from any other path. At present this is still a work-in-progress and customers are advised to not use it.

Periodically the International Earth Rotation Service determines that Coordinated Universal Time needs to be adjusted to match the mean solar day. This is because the earth doesn’t rotate at a constant speed, and a mean solar day generally takes slightly longer than 86,400 seconds. These adjustments have been done by adding (or, in theory, removing) a second every few years. The next leap second insertion is scheduled for June 30th, 2015 at 23:59:60 UTC. How does GemStone/S handle that?

GemStone gets time from the host OS as Unix Time, which is typically described as the number of seconds that have elapsed since 00:00:00 Coordinated Universal Time (UTC), Thursday, 1 January 1970. This definition is correct so long as we assume that each and every day has exactly 86,400 seconds. If you take into account the 26 leap seconds that will have been added as of June 30, 2015, it is more accurate to say that Unix Time is the number of atomic seconds that have elapsed since 00:00:26 UTC, 1 January 1970. The way the adjustment occurs is that when there is a leap second, the Unix time counter takes two atomic seconds to advance by one Unix second.

The impact of this approach is that a record of when something happened will be correct, but the time between two events could be reported as being one or more seconds less than the actual time between the events. For example, Unix time (and GemStone) will report that there are five seconds between June 30 at 23:59:55 and July 1 at 00:00:00 when in fact in 2015 there were six seconds between those two times.
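On a system with GNU date you can see this directly:

```shell
# Unix time treats every day as exactly 86,400 seconds, so the leap
# second inserted at 2015-06-30 23:59:60 UTC is invisible here: the
# difference is reported as 5 even though 6 atomic seconds elapsed.
t1=$(date -u -d '2015-06-30 23:59:55' +%s)
t2=$(date -u -d '2015-07-01 00:00:00' +%s)
echo $((t2 - t1))   # prints 5
```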

Whether this matters is an application-level or domain-level problem. If keeping track of the time between events needs to be accurate (e.g., recording some rapid physics event), then relying on Unix time (or GemStone) will not be sufficient. If all that is needed is a timestamp and it is acceptable to have June 30, 2015 23:59:59 UTC last for 2000 milliseconds, then things should be fine.
