Saturday, December 27, 2008

The Ad-Hoc Approach Can Work, Sometimes

Joel Spolsky's introduction to his article How Hard Could It Be?: The Unproven Path at Inc.com states that he has broken seven rules about creating a technology venture. All of the rules (except maybe the last one) also pertain to executing software development projects. If I were to draw one conclusion from the article, it is this: you can, sometimes, create useful, quality software in a reasonable time frame without following established approaches --- that is, an ad-hoc approach can work. The article gives one example where this approach succeeded. A few of my own projects, and some of my colleagues' projects, have also succeeded in spite of breaking most of these rules. I'm willing to bet a large percentage of single-person or small-team open source projects have succeeded using the same approach.

Don't get me wrong, I'm not advocating the ad-hoc approach for software development projects. In fact, even suggesting so leaves a bad taste in my mouth and gives me a sick belly. But I have to be realistic: for certain types of projects, it does work. I believe the success of this approach depends heavily on the project team and the nature of the project. Off the top of my head, here are a few ingredients a project should have for the ad-hoc approach to succeed.

  • The team must be small. The larger the team the less chance the ad-hoc approach will be successful.
  • The entire team must consist of excellent developers: smart, experienced, knowledgeable, passionate (Joel states this in a slightly different way at the end of his article).
  • The team must have a deep, intrinsic understanding of the requirements for the software.
  • The team must believe in and take ownership of the project.
  • The entire project should be built in-house --- a minimum of collaboration with, and dependencies on, external teams, especially teams from other companies.
  • The software is being built from the ground up.
  • The product is not an "enterprise solution" built in an IT shop where
    • lots of red tape, bureaucracy, hierarchy, turf wars, and office politics can get in the way; and,
    • the software being built is not tightly coupled with lots of other enterprise software.

With these points in mind, let's revisit Joel's seven broken rules (sometimes liberally paraphrased) and try to identify when and why each rule can be broken without negatively affecting the project. (The following comments are opinions based on my experience; I have done no research on their validity.)

Hire only the best programmers. I wholeheartedly agree with this rule for all projects: it can't be broken without consequences. Joel thinks he broke this rule by hiring Jeff Atwood and Jeff's team of developers without checking whether they "could write good code". I don't think he really broke the rule at all, since Jeff's reputation as an intelligent, passionate, good developer is generally accepted by the development community. There is very little risk in hiring a developer as well respected as Jeff.

All developers and management staff must be in the same office. Before the mid-1990s, this rule could not be broken. That is no longer the case, especially for projects with a small number of developers. All team members can communicate effectively and efficiently --- perhaps more so --- using readily available communication and collaboration technology, including basics such as email, chat, video chat, Twitter, wikis, phones, video phones, and video conferencing. As an example, consider 37signals, a company that embraces the distributed model in addition to building tools that support it.

Plan before proceeding. Planning is always done, but when, how much, how formally, by whom, and whether it is documented can change for each project. For projects that meet most of the criteria outlined above, it is sometimes OK to do minimal up-front planning, as opposed to complete project plans where the entire project is broken down into many fine-grained tasks and extensively documented. The project plan can live in the team leaders' heads, perhaps with some notes so things are not forgotten (but with minimal detailed, formal documentation), and be kept at a very high level. The plan, written down or not, is given just enough detail to get the project going and keep it going from one major task (short phase) to the next. The plan is iteratively fine-tuned and changed as the project proceeds. Task-level planning also occurs as developers plan how best to tackle the next new task. Again, this is very informal and may not be documented.

Issue tracking. For new projects you can probably make do without a formalized tracking system, mainly because most issues (e.g. bugs) will be identified after the project is done and the product is in use. Nevertheless, I recommend some place where developers can note potential improvements and known unresolved issues they encounter during development. In the most basic form, these items can live in code comments flagged with tags like TODO (which can be automatically extracted into a TODO file using a simple script; a sketch of one appears after the list below), or in a simple TODO.txt file at the root of the project. For long-term products with ongoing maintenance and feature development, where many different developers have to work on the product, issue tracking is a must. A sophisticated system may not be necessary, but ideas, bugs, issues, features, and tasks need a centralized place to live so

  • issues are not forgotten as different developers work on the product over time;
  • a place exists to record discussions and information gained about the issues;
  • issues can be organized (in particular they can be prioritized);
  • issues can be effectively communicated to all team members;
  • responsibility for issues can be assigned, tracked, and communicated; and
  • managers can track issues, their status, and the progress made on addressing them, and can plan new phases of development.
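
As an aside, here is a minimal sketch of the kind of TODO-extraction script mentioned above, written in Java for illustration (the src source root, the .java filter, and the TODO tag convention are my assumptions, not a prescribed tool):

import java.io.*;

// Sketch: walk a source tree, collect TODO-tagged lines, and write
// them to a TODO file at the project root.
public class TodoExtractor {
    public static void main(String[] args) throws IOException {
        PrintWriter out = new PrintWriter(new FileWriter("TODO"));
        scan(new File("src"), out); // assumed source root
        out.close();
    }

    private static void scan(File file, PrintWriter out) throws IOException {
        if (file.isDirectory()) {
            File[] children = file.listFiles();
            if (children == null) return;
            for (File child : children) scan(child, out);
        } else if (file.getName().endsWith(".java")) {
            BufferedReader in = new BufferedReader(new FileReader(file));
            String line;
            int lineNo = 0;
            while ((line = in.readLine()) != null) {
                lineNo++;
                if (line.contains("TODO")) { // collect tagged lines
                    out.println(file.getPath() + ":" + lineNo + " " + line.trim());
                }
            }
            in.close();
        }
    }
}

A script like this can be run as part of the build (or from an Ant target) to regenerate the TODO file.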

Test software before releasing it. Unless you have perfect developers, I don't think this rule can be broken. What changes is who does the testing, how formalized it is, the types of testing performed, and when it is done. I would argue that for projects with really good, experienced developers, you can probably minimize the amount of formalized testing since (1) the number of bugs per line of code introduced during construction will be low (compared to inexperienced or bad developers), and (2) these developers will do significant testing as they develop. For small or new projects that developers are passionate about, they will also do the end-user testing themselves: no formalized testing teams necessary. In some cases, it may be appropriate to release the software in alpha and beta states to get user testing in the field, further reducing the need for formalized testing and test teams. In many cases, however, such as mission-critical software and software that must interface with existing systems (especially legacy ones), this ad-hoc approach may be neither sufficient nor efficient.

Create schedules to ensure a project is delivered on time, within budget, and meets the requirements. For some projects, especially new projects where the software being built is charting unexplored territory, detailed schedules may be unnecessary or of little use because the requirements, the tasks that need to be completed, and the complexity of those tasks are not well understood (see Frequently Forgotten Fundamental Facts about Software Engineering). Estimates are generally so poor that they are often no better than a general, all-encompassing guess anyway. Alternatively, the project might be small and well understood, and the developers very experienced in building similar software, so they know exactly what needs to be done and can give an accurate overall estimate such as "6 to 8 weeks", reducing the need for detailed schedules.

Decide on how to make a profit before building the product and measure its success by profit. This rule is about the business side of software development, so I won't comment on when and why you can break it. I will state that a project can be classified as successful without making a profit: for example, it is useful to someone, it has quality, and it meets its requirements. Many open source projects are definitely successful, yet they make no profit. The point Joel makes about building useful software first, then worrying about making a profit from it later, reminds me a lot of Steve Yegge's article Business Requirements are Bullshit.

References

  1. Joel Spolsky, How Hard Could It Be?: The Unproven Path, Inc.com, 2008.
  2. Steve Yegge, Business Requirements are Bullshit, 2008.
  3. Robert Glass, Frequently Forgotten Fundamental Facts about Software Engineering, IEEE Computer Society, 2008.

Wednesday, December 17, 2008

RIAs and the Future of the Open Web

In The Struggle for the Soul of the Web Chris Keene writes:

Just because the web has been open so far doesn't mean that it will stay that way. Flash and Silverlight, arguably the two market-leading technology toolkits for rich media applications are not open. Make no mistake - Microsoft and Adobe aim to have their proprietary plug-ins, aka pseudo-browsers, become the rendering engines for the next generation of the Web.

and

The worse the underlying browser is at rendering rich widgets and media, the more developers and users will want your plug-in. If you are both the vendor of a browser (say IE) as well as the proponent of a plug-in (say Silverlight), then the incentives get truly twisted.

I believe this is why Microsoft (MS) has hijacked JavaScript 2.0 (I can't figure out why Yahoo didn't support it) and is hijacking HTML 5. These standards would enable other companies to more easily develop rich widget and media frameworks that would compete directly with Silverlight. From MS's point of view, that is unacceptable. Although both Adobe and MS have "pseudo-browsers", MS is a much bigger threat to an open and standard internet because they also make one of the world's leading browsers, Internet Explorer (IE) --- the potential for proprietary lock-in here is huge.

Consider what happens with increasing adoption of Silverlight. First, Silverlight no longer works so well in non-IE browsers; a year or two later it's only supported by the top two browsers; and once a critical mass is achieved, it only works in IE. I can see the website banners now: "viewable only using IE and Silverlight". What browser do you think most people will be forced to use? And since IE is tied to Windows, websites will essentially become Windows OS dependent. Moreover, developers will be forced to use the .NET framework for Silverlight, which leads to developers having to use MS development tools on an MS OS... where does it end? Why would MS even care about or need open web standards once all users are locked into using IE+Silverlight+Windows?

I think Silverlight --- the VM and its development tools --- is a great solution for RIAs. However, I'm a strong supporter of open standards that benefit all of our industry. As such, I will avoid, as much as possible, Silverlight for fear I may contribute to the demise of our open web.

References

[1] Chris Keene, The Struggle for the Soul of the Web, AjaxWorld (ajax.sys-con.com), Dec 2008

Saturday, December 13, 2008

Great Developers Need Great Office Space

As written by Jeff Atwood for Stack Overflow podcast 31 [1]:

Joel [Spolsky] justifies having a nice office space as 1) a recruiting tool 2) enabling higher programmer productivity and 3) the cost of a nice office space is a tiny number relative to all your other expenses running a company. I [Jeff] argue that companies which don't intuitively understand why nice office space is important to their employees who spend 8+ hours every day there... well, those companies aren't smart enough to survive anyway.

Enough said.

[1] Podcast #31, Stack Overflow Blog, blog.stackoverflow.com, 2008

Tuesday, December 9, 2008

Methods for Speeding Up Your Website

The Yahoo! Developer Network has an excellent article (Best Practices for Speeding Up Your Web Site) that describes 34 ways to improve your website's performance. You can also see live examples of 14 of these items at 14 Rules for Faster-Loading Web Sites. The article is especially pertinent to developers who make heavy use of JavaScript+AJAX, HTML, and CSS (as opposed to Flash, for example) in their websites.
Following is a listing for easy reference (a short example of one of the rules follows the list):

  • Make Fewer HTTP Requests
  • Use a Content Delivery Network
  • Add an Expires or a Cache-Control Header
  • Gzip Components
  • Put Stylesheets at the Top
  • Put Scripts at the Bottom
  • Avoid CSS Expressions
  • Make JavaScript and CSS External
  • Reduce DNS Lookups
  • Minify JavaScript and CSS
  • Avoid Redirects
  • Remove Duplicate Scripts
  • Configure ETags
  • Make Ajax Cacheable
  • Flush the Buffer Early
  • Use GET for AJAX Requests
  • Post-load Components
  • Preload Components
  • Reduce the Number of DOM Elements
  • Split Components Across Domains
  • Minimize the Number of iframes
  • No 404s
  • Reduce Cookie Size
  • Use Cookie-free Domains for Components
  • Minimize DOM Access
  • Develop Smart Event Handlers
  • Choose <link> over @import
  • Avoid Filters
  • Optimize Images
  • Optimize CSS Sprites
  • Don't Scale Images in HTML
  • Make favicon.ico Small and Cacheable
  • Keep Components under 25K
  • Pack Components into a Multipart Document
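
As one illustration, the Expires/Cache-Control rule could be applied in a Java servlet along these lines (a hypothetical sketch; the class name and the served resource are invented for the example, not taken from the article):

import java.io.IOException;
import javax.servlet.http.*;

// Sketch: serve a static component with far-future cache headers so
// browsers can skip the HTTP request on repeat page views.
public class CachedResourceServlet extends HttpServlet {
    private static final long ONE_YEAR_MS = 365L * 24 * 60 * 60 * 1000;

    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        resp.setHeader("Cache-Control", "public, max-age=31536000");
        resp.setDateHeader("Expires", System.currentTimeMillis() + ONE_YEAR_MS);
        resp.setContentType("image/png");
        // ... write the resource bytes to resp.getOutputStream() ...
    }
}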

Friday, December 5, 2008

Why Use Adobe Flex?

Here are some reasons you might want to use Adobe Flex (Flash) to build the UI (or some part thereof) for your next web application:
  • It's an open source development kit.
  • It comes with everything needed to build and deploy Flex UIs. It does not include an IDE, but you can buy an IDE or plug-in for Eclipse to dramatically ease development.
  • It is supported and principally developed by Adobe. Hence, it has commercial backing with a vested corporate interest. Some would argue this lowers the risk of Flex becoming vaporware anytime soon. Adobe also offers commercial support.
  • It has a large and growing development community. For example, it has a sister open source project for unit testing called FlexUnit.
  • It is used by many large and small corporations.
  • It is based on standards (at least arguably: JavaScript 2.0, HTTP, XML, etc.) and proven technology, such as Flash.
  • It has extensive documentation.
  • It has a large library of existing UI and non-UI components. One can also buy libraries of custom components (e.g. charting and graphing).
  • It is flexible and extensible. For example,
    • it can be integrated into any existing web page without completely taking over that web page (it can do that too if desired!);
    • all GUI and non-GUI components can be extended and customized;
    • it can access a number of back end data sources and application frameworks, including Java Servlets, Flex Data Services, and REST; and
    • it can use different communication mechanisms, including raw TCP/IP sockets, HTTP, and SOAP.
  • It works well with Java back ends, such as servlets and EJBs. Hence, you can harness existing expertise and code base with few changes.
  • The development model is easy to use and understand. For example, you use XML to lay out GUI components, JavaScript 2.0 (which is more Java-like than JavaScript 1.5) to attach behavior to the components, and a simple set of APIs to access the server.
Although this is supposed to be a post about why you should use Flex, I have to add this negative point because it caused me no end of grief in my last project: Flex/Flash does not have a component to display standards-compliant HTML/CSS. So, if you're like me, and you have some existing HTML content you want to display in the UI, you're out of luck. There are some options available, such as rendering content outside Flash and overlaying it on Flash using IFrames, but I was never satisfied with them. Here are some links about using such approaches:

Wednesday, December 3, 2008

Conditional CSS

Even though CSS is a standard, there are differences in how web browsers render and support it. By far the biggest deviant is IE6, which has the poorest compliance of the major browsers I support (Firefox, IE, Safari). In some cases you can simply tolerate the differences, but more often than not you have to find workarounds to convince the non-compliant browser to do what you want. I generally target Firefox first to get what I want (because it has good standards compliance and has Firebug) and then tweak the CSS to work around IE issues. Following are two methods to conditionally include CSS depending on the target browser.

CSS Hacks

The first method is to exploit CSS parsing bugs in browsers (mainly IE) to accept or ignore CSS attributes. This is often called CSS hacking. Although commonly used, it is not recommended [1,2]. Two common hacks prefix CSS attributes with a special character to select the browser and browser version:

*attribute -- IE 7 and below (I haven't tested on IE 8)
_attribute -- IE 6 and below
attribute -- all browsers

Example:

div {
  *width: 20px;
  _width: 20px;
}

Conditional Comments

The second method is to use conditional comments [1,2]. This relies on a feature of IE that conditionally includes HTML and CSS content using special commands embedded in comments. It is the recommended method because it does not depend on bugs to work. However, it is not as tidy or simple to use as CSS hacks, it can't be used for other browsers, and it relies on modifying the HTML source. The following example includes the CSS resource file main.css only if the browser is IE 6.

<!--[if IE 6]>
<link rel="stylesheet" type="text/css" href="main.css" />
<![endif]-->

You can also use comparators such as lt and gt. This example includes the CSS file only if the browser is IE 7 or less.

<!--[if lte IE 7]>
<link rel="stylesheet" type="text/css" href="main.css" />
<![endif]-->

References

[1] http://www.javascriptkit.com/dhtmltutors/csshacks.shtml
[2] http://www.quirksmode.org/css/condcom.html

Saturday, November 29, 2008

Website SSL Certificates

Understanding Web Site Certificates [1] has a nice, succinct description of website certificates. In summary, a website certificate is used to identify a secure web site, in the sense that it is a trusted web site (e.g. not a phishing site) and that data transmitted to and from your browser is secure (e.g. encrypted using SSL). Trusting a certificate means you are trusting one authority, from a list of certificate authorities known to your browser, to have verified that the web site you are visiting is legitimate and secure. Although rare, this process has been known to fail: Brian Krebs in The New Face of Phishing [2] described a sophisticated phishing scam that used a valid SSL certificate issued by a "trusted" authority. If you visit a website whose certificate is signed by an organization untrusted by your browser, or whose certificate contains an error (e.g. it has expired), the browser displays a dialog prompting you to decide whether you want to accept the certificate [1]. Before accepting a certificate, ensure it
  • has a valid and trusted issuer, such as Verisign,
  • has not expired, and
  • has been assigned to the web site organization you are visiting.
If this dialog is not displayed, say because your browser accepts the certificate, you can still manually examine the certificate if you wish. Normally this can be done by clicking on some visual indicator in your browser while you are on the protected site.

Don't just assume that because a website is protected by a certificate the site must be legitimate. Some phishing sites have used self-signed certificates to create the illusion of legitimacy [3]. The site author's hope is that the unwary visitor will be tricked into believing that because a certificate is present, the site is secure and personal information can safely be submitted. A web site that issues a certificate to itself should always be viewed with some suspicion [4].

Do you need SSL certificates for intranet (internal only) websites? If you are transmitting sensitive information between browsers and servers that some employees should not see (e.g. passwords), then yes. This assumes you believe your employees are malicious enough to start snooping for such confidential information. What about phishing? This may be less of an issue because the phisher would need to know the look and feel of your internal website in order to mimic it convincingly. But if such information can be obtained, then SSL certificates would be useful.

References
[1] Mindi McDowell and Matt Lytle, Understanding Web Site Certificates, National Cyber Alert System, Cyber Security Tip ST05-010, Carnegie Mellon University, 2008
[2] Brian Krebs, The New Face of Phishing, The Washington Post, 13 Feb 2006
[3] Bill Brenner, Phishers' latest hook: SSL certificates, The New Sendmail, 27 Sep 2005
[4] Jack Schofield, Website certificates -- don't go there?, 2007

Thursday, November 27, 2008

Robert Glass' Fundamental Facts: A Reminder

IEEE Computer Society has made public an article entitled Frequently Forgotten Fundamental Facts about Software Engineering [1] by Robert Glass, author of the excellent software engineering book Facts and Fallacies of Software Engineering. Although I believe all software developers could benefit from reading this article, it seems especially relevant to team leaders and managers. I'm sure most seasoned professionals are aware of these facts already, but as the title suggests, it's good to be reminded every now and again. A few juicy tidbits inspired me to rant a little.

For those who want to build a great development team, remember that "good programmers are up to 30 times better than mediocre programmers"; moreover, good programmers are far more important to building great software than tools and techniques [1]. I have witnessed people in upper management who believe all that is needed is a bunch of code monkeys who can work the longest hours possible for the lowest pay possible, coupled with the latest fad in development tools. If you throw enough programmers at the problem you will probably get the job done, but I'm willing to bet the product will be over budget, will be hard or impossible to maintain, will be buggy, and will certainly not satisfy the customer. In other words, it will not meet the common definition of quality software outlined by Glass: portability, reliability, efficiency, human engineering, understandability, and modifiability [1]. In the end, the product will cost your company more than if you had started with a great team up front. Hire the best and brightest, not the cheapest.

Regarding estimation, Glass seems a bit pessimistic, but his brutal truth is very funny. In a nutshell, estimates are "done at the wrong time ... (at the beginning of the life cycle ... before the requirements)" and "by the wrong people ... (upper management and marketing)", thus "software projects do not meet cost or schedule targets. But everyone is concerned anyway" [1]. Priceless! I generally suggest to clients and/or management that estimates be made after the requirements are done, when the problem is better understood. But as pointed out by Glass, and corroborated by my experience, this rarely occurs. Generally a client wants product X by date Y within budget Z. Madness.

References

[1] Robert L. Glass, Frequently Forgotten Fundamental Facts about Software Engineering, IEEE Computer Society, 2008
[2] Robert L. Glass, Frequently Forgotten Fundamental Facts about Software Engineering, IEEE Software, vol. 18, no. 3, 2001, pp. 112, 110-111 (the original publication)

Tuesday, November 25, 2008

Top 10 Web Application Security Vulnerabilities

If you are developing web applications and don't know the meaning of the following 10 security threats, or how to prevent them, the OWASP Top 10 is good reading material. (A short example of preventing one of them follows the list.)
  • Cross Site Scripting (XSS)
  • Injection Flaws
  • Malicious File Execution
  • Insecure Direct Object Reference
  • Cross Site Request Forgery
  • Information Leakage and Improper Error Handling
  • Broken Authentication and Session Management
  • Insecure Cryptographic Storage
  • Insecure Communications
  • Failure to Restrict URL Access
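
For a taste of the material, here is a minimal sketch of defending against one item on the list, injection flaws, by binding user input with a parameterized query instead of concatenating it into the SQL string (my own example, not OWASP's; the table and column names are assumptions):

import java.sql.*;

// Sketch: the user-supplied value is bound as data, so it is never
// parsed as SQL, which defeats classic injection attempts.
public class UserDao {
    public static boolean userExists(Connection con, String userInput)
            throws SQLException {
        PreparedStatement ps = con.prepareStatement(
            "SELECT 1 FROM users WHERE name = ?");
        ps.setString(1, userInput);
        try {
            return ps.executeQuery().next();
        } finally {
            ps.close();
        }
    }
}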

Sunday, November 23, 2008

Steve Yegge's Property List Pattern Summary

Steve Yegge has written an article entitled The Universal Design Pattern in which he describes the Property List Pattern in detail [1]. The key design elements of this pattern (as he describes them) are listed below; a small code sketch follows the list.
  • It has the basic methods of a Map (using Java's terminology): get, put, has, and remove.
  • Keys are generally strings.
  • It has a pointer to a parent property list so properties can be inherited and overridden. In particular, certain operations, such as get, are applied on a child, but if the property is not found there, it is applied to the parent.
  • Reading a property returns the first value encountered for that property, from the child if it exists, otherwise from its ancestor.
  • Writing (and deleting) a property on the child only changes the property list for that child. When deleting a property that is inherited, it must be flagged in the child as deleted, not actually deleted. Otherwise, all siblings and the parent will have the property deleted.
  • Properties can have meta-properties. Common ones include information governing types and access control (such as "read-only").
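Putting these elements together, a minimal Java sketch of the pattern might look like the following (my own illustration, not Yegge's code; meta-properties and the optimizations he discusses are omitted):

import java.util.HashMap;
import java.util.Map;

// Sketch of a property list: a map with parent-based inheritance
// and tombstoned deletes.
public class PropertyList {
    // Marker stored when a property is deleted on a child, so the
    // parent's value is masked rather than actually removed.
    private static final Object DELETED = new Object();

    private final PropertyList parent; // may be null
    private final Map<String, Object> props = new HashMap<String, Object>();

    public PropertyList(PropertyList parent) {
        this.parent = parent;
    }

    // Returns the first value found walking up the parent chain.
    public Object get(String key) {
        Object value = props.get(key);
        if (value == DELETED) return null; // masked in this child
        if (value != null) return value;
        return parent != null ? parent.get(key) : null;
    }

    public boolean has(String key) {
        return get(key) != null;
    }

    // Writes affect only this list, never the parent.
    public void put(String key, Object value) {
        props.put(key, value);
    }

    // Deleting an inherited property records a tombstone so siblings
    // sharing the same parent are unaffected.
    public void remove(String key) {
        props.put(key, DELETED);
    }
}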
Why would you want to use this pattern? Yegge mentions several reasons, including (1) it is very scalable, so it can be applied to single classes or used as part of a larger framework, and (2) it enables extensible systems. Yegge also describes several issues with the pattern, two of which struck home with me. The first is that its performance may be unacceptable for some applications, although Yegge describes many optimizations that can be made. The second is that it is subject to data corruption --- for example, misspelling a key and then adding data under it.

References

[1] Steve Yegge, The Universal Design Pattern, 2008

Friday, November 21, 2008

Recompiling JSPs During Development in WebSphere/Eclipse

Here's the scenario. You change a public static final variable, say a version number, which you use in your JSPs. You clean the project and rebuild it expecting the new version number to show up in your rendered JSP page, but it still shows the old version number, completely ignoring the change you just made. Or, when you are using JSP includes, the included file changes but the file including it does not, so the latter is not recompiled and never sees the changes. What!?

OK, so maybe there is some special command in WebSphere (WS) I need to use --- none that I can find. So what do you do? A Google search revealed one method (suggested on the JavaRanch forum [1]): delete the compiled JSPs from the server's cache. On WS 5.0, these live in the directory Workspace\.metadata\.plugins\com.ibm.wtp.server.core\tmp0\cache\localhost\server1\EAR\war. However, I'm using WS 7.0 and the cache is no longer in this directory, or at least I couldn't find it.

The other method is to manually save ('touch') each JSP file so its modification time changes; if the files are modified, they will be recompiled. Of course, you can open each JSP file, make a change, and save it. Instead, I added a "touchjsp" target to my Ant build script to automate this for me. The target is something like the following
<target name="touchjsp">
 <touch>
   <fileset dir="WebContent" includes="**/*.jsp" />
 </touch>
</target>
This can be called as part of the build process, or directly to simply touch the JSP files. Depending on how the target is executed, you may or may not have to refresh the project in WS so it detects and compiles the files. Having to do this manually in an IDE as mature as Eclipse/WS is ridiculous. At minimum a command should exist to "recompile all JSP files". If there is an easier way to do this than described above, please enlighten me.

References

[1] JSP changes does not reflect, JavaRanch Forum, Jan 2008

Wednesday, November 19, 2008

Flying Saucer, XHTML Rendering, and Local XML Entities

I have been using Flying Saucer's (FS) XHTML Renderer [1] for at least 5 months now. It's an excellent library for rendering XHTML for display in Java Swing (using FS's XHTMLPanel) and for converting XHTML content to PDF files (see Generating PDFs for Fun and Profit with Flying Saucer and iText [2]). In my latest project, I create a report in XHTML and use FS to create a PDF version as follows:

String baseUrl = ... // root URL for resources
InputStream xhtml = ... // XHTML content
// Parse the XHTML into a DOM document.
DocumentBuilder builder =
  DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document doc = builder.parse(xhtml);
// Lay out the document and write it as a PDF.
ITextRenderer renderer = new ITextRenderer();
renderer.setDocument(doc, baseUrl);
OutputStream os = new FileOutputStream("Out.pdf");
renderer.layout();
renderer.createPDF(os);
os.close();


One benefit of this method is that it is much easier to lay out the report in XHTML than with a PDF library like iText directly. A second benefit is that I can produce the report in two different formats (PDF and XHTML) and render it in a GUI with very little extra work.

Everything worked fine until yesterday.

The problem I ran into was that FS (more correctly, the libraries it depends on) would randomly fail with errors such as
java.net.SocketException: Connection reset
and
java.io.IOException: Server returned HTTP response code: 503 for URL: http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd

After help from the very responsive FS users group [3], it turned out that my XML document builder would sometimes fail when accessing the web to resolve XML entities, even though I was connected to the Internet. I was able to fix the problem by adding one line of code that configures the entity resolver to use XML entities on the class path (conveniently packaged in FS's core-renderer.jar).

String baseUrl = ...
InputStream xhtml = ...
DocumentBuilder builder =
  DocumentBuilderFactory.newInstance().newDocumentBuilder();
// Use FS's locally cached XML entities so we don't
// have to hit the web.
builder.setEntityResolver(FSEntityResolver.instance());
Document doc = builder.parse(xhtml);
ITextRenderer renderer = new ITextRenderer();
renderer.setDocument(doc, baseUrl);
OutputStream os = new FileOutputStream("Out.pdf");
renderer.layout();
renderer.createPDF(os);
os.close();


The moral of the story is that some thought should always be given to how your XML parsers are configured to resolve entities. In many cases it makes more sense to use a local store rather than a default remote site such as www.w3.org. This is especially important if the application doing the parsing may run on computers not connected to the Internet. This applies to FS and to any other library that parses XML.
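
For code that does not use FS, the same idea can be implemented with a custom SAX EntityResolver. Here is a hypothetical sketch that serves the XHTML 1.0 Strict DTD from a copy bundled on the class path (the resource path /dtds/xhtml1-strict.dtd is an assumption; you must package the DTD yourself):

import java.io.InputStream;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Sketch: resolve a well-known DTD from the class path instead of
// fetching it from www.w3.org on every parse.
public class LocalEntityResolver implements EntityResolver {
    public InputSource resolveEntity(String publicId, String systemId) {
        if (systemId != null && systemId.endsWith("xhtml1-strict.dtd")) {
            InputStream dtd = getClass()
                .getResourceAsStream("/dtds/xhtml1-strict.dtd");
            if (dtd != null) return new InputSource(dtd);
        }
        return null; // fall back to the parser's default resolution
    }
}

It would be installed the same way as above: builder.setEntityResolver(new LocalEntityResolver()).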

References

[1] The Flying Saucer Project, 2008
[2] Joshua Marinacci, Generating PDFs for Fun and Profit with Flying Saucer and iText, 2007
[3] Flying Saucer User Mailing List, 2008

Sunday, November 16, 2008

Math Interval Notation

Recently, while writing some documentation, I needed notation to represent a range of numbers. Given my math background, I naturally fell to using mathematical notation. And, as has happened many times in the past, I could not remember which type of bracket represents inclusion and which represents exclusion: parentheses or square brackets? It turns out that parentheses ( and ) are used for exclusive endpoints and square brackets [ and ] are used for inclusive endpoints. Let a and b be enumerable values (more specifically, members of a totally ordered set) such that a < b. Then for endpoints a and b,
(a,b) is all values > a and < b
[a,b] is all values >= a and <= b
(a,b] is all values > a and <= b
[a,b) is all values >= a and < b
References

[1] Interval (mathematics), Wikipedia, 2008
[2] Set-builder & Interval Notation, Oswego City School District Regents Exam Prep Center, 2008
[3] Totally Ordered Set, Wolfram MathWorld, 2008

Friday, November 14, 2008

Sansa c240 Portable Detection in Winamp

I have had the Sansa c240 MP3 player since last Christmas and have always manually dragged and dropped music files onto the device using Windows File Explorer. This can be a painful process when you want to transfer a variety of songs and albums distributed throughout several directories. So I decided to try managing the transfer of files using Winamp. No problem according to the documentation: plug the device into the USB port, Winamp detects it and displays its contents in the Portables view. Needless to say, this did not happen.

After much research and trial and error, the solution turned out to be rather simple: plug the device into the USB port; then, under Preferences -> Plug-ins -> Portables -> Nullsoft USB Device Plug-in -> Configure, select the drive letter of the USB device. If necessary, select to unblock the device. I'm not sure why I needed to do this, but I'm guessing that sometime in the past I must have chosen to block my device.

Now if I could only stop this silly error from occurring every time I insert the player into the USB port. At least everything seems to function correctly after pressing Continue to ignore the error.

Monday, November 10, 2008

DB Record Insertion Rate: A Trivial Experiment

I was recently involved in a preliminary investigation for a new project that requires (at least what I thought was) a very high rate of record insertion into a database. In a nutshell, the ability to insert a minimum of 200 records per second, up to a maximum of 2000 records per second, is required. Even though the records are small, at 80 bytes each spread across 4 fields, I wasn't confident the DB I had at my disposal would meet these requirements. So I ran a quick experiment.

The experiment involved writing a Groovy script to insert 17 million records into a Microsoft SQL Server 2000 DB, both running on my laptop. My laptop has an Intel Core 2 Duo (1.8 GHz) CPU and 3 GB of RAM. Each record consisted of 120 bytes, which was split into 5 fields when inserted into the DB. The script inserted the records in batches of 1000 using SQL similar to the following, with 'ValueX' replaced by content to fill out the required 120 bytes [see Note 1]. (A JDBC sketch of this batching approach appears after the results.)

INSERT INTO Messages (Col1, Col2, Col3, Col4, Col5)
SELECT 'Value1', 'Value2', 'Value3', 'Value4', 'Value5'
UNION ALL
SELECT 'Value1', 'Value2', 'Value3', 'Value4', 'Value5'
UNION ALL
SELECT 'Value1', 'Value2', 'Value3', 'Value4', 'Value5'
UNION ALL
SELECT 'Value1', 'Value2', 'Value3', 'Value4', 'Value5'
....

Results:
  • Approximately 1400 records per second were inserted.
  • The size of the DB did not affect the insertion rate; i.e. the rate did not decrease as the DB grew larger.
  • Maximum memory usage was about 2 GB.
  • The DB required about 6GB of disk space (includes index files).
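For reference, here is a rough JDBC version of the batching approach (the actual experiment used a Groovy script; the connection URL, credentials, and counts below are illustrative assumptions):

import java.sql.*;

// Sketch: time 100 batches of 1000 rows inserted via the UNION ALL
// trick described above.
public class InsertRateTest {
    public static void main(String[] args) throws SQLException {
        Connection con = DriverManager.getConnection(
            "jdbc:sqlserver://localhost;databaseName=Test", "user", "pass");
        String row = "SELECT 'Value1', 'Value2', 'Value3', 'Value4', 'Value5'";
        StringBuilder sql = new StringBuilder(
            "INSERT INTO Messages (Col1, Col2, Col3, Col4, Col5) ");
        for (int i = 0; i < 1000; i++) {
            if (i > 0) sql.append(" UNION ALL ");
            sql.append(row);
        }
        Statement stmt = con.createStatement();
        long start = System.currentTimeMillis();
        for (int batch = 0; batch < 100; batch++) { // 100,000 rows total
            stmt.executeUpdate(sql.toString());
        }
        long elapsed = System.currentTimeMillis() - start;
        System.out.println((100000L * 1000) / elapsed + " records/second");
        stmt.close();
        con.close();
    }
}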
The insertion rate does not meet my upper bound, but it is not too bad given that the experiment was executed on my underpowered laptop.

Note 1

I could not use the more sensible row value constructor syntax [1]

INSERT INTO table (column1, [column2, ...])
VALUES (value1a, [value1b, ...]),
       (value2a, [value2b, ...]),
       ...

since it is not supported by MS SQL Server 2000. Thank goodness it is supported in SQL Server 2008.

References

[1] Insert (SQL), Wikipedia

Friday, November 7, 2008

The Why and What of This Blog

Steve Yegge in You Should Write Blogs goes into detail as to why I (and perhaps you) should be blogging. I'm under no illusion that my blog will ever approach the substance, quality, and quantity of Yegge's. That's not my intent. My goals are to:
  • Document new things I have learned, information obtained, and experiences gained in my day to day work as a software developer.
  • Write short summaries (the main take away points) of articles I read.
  • Record anything that is interesting to me, and thus maybe interesting to someone else.
  • Force me to write, and thereby practice my writing skills.
  • Allow me to contribute something (hopefully useful) to the software development profession.
  • Give me a chance to rant every now and again, or share my opinion.
It is my hope that writing short blog entries will help me organize my thoughts, help me retain information, and act as a reference source for me (and maybe others). The blog will generally take the form of short notes and summaries --- tidbits of software development knowledge, if you will. I will not write many essays here. Although I may write off-topic sometimes, nearly all posts will relate to some aspect of software development. At present, I expect the content to be dominated by the more technical aspects of the profession. And if I achieve none of the goals above, at least I have an OpenID account I can use on Stack Overflow.