Friday, 16 January 2015

Microservices and the future of business enterprise software development

Business processing has evolved from being
  • Done by humans, to
  • Hardware, to
  • Machine code, to
  • Compiled, to
  • Intermediately compiled, to
  • Scripted 

At each step of this evolution the systems become more and more interconnected: isolated mainframes evolve into microservices that run on mobiles and in the cloud.

As the cost of implementing each component falls, so does the need to invest in its planning and design, until you almost do not want to do any at all.  This 'suck it and see' approach, where the implementation is the prototype and eventually becomes the published design, often gives the best results, as experience is a valuable form of design feedback.  This reductive agile process is like chipping away at a piece of marble, compared with the additive process of working with clay.

Interestingly, this approach was used with great success by Russian rocket scientists on a fraction of the Americans' budget.  For example, the design bureau's plans for an efficient N1 rocket motor only worked once the build engineers had played with the designs; when perfected, the design was sent back to the design bureau.  Although the rocket never went to the moon, the motors were successfully used decades later by the Americans.

This counter-intuitive way of doing things motivates developers, as they can just get on with doing what they love: turning requirements into living code.  It results in smaller teams and depends less on software architects and project managers, in exchange for a more direct dialogue with the product owner.

Having said that, if anything goes then there would be anarchy, so some simple standards are required: code quality, agreed methods/interfaces for communicating between components, and code reuse of standard sub-components, e.g. logging, database connectivity, email etc.  The rules need to be few, simple, clear and, wherever possible, automatically checked.  Scientists have found that fish follow very simple rules to work effectively in a shoal; the same goes for microservices.

The language that best suits this is JavaScript/Node: it is portable, lightweight and extensively supported.  However, being a bit of a rush job when first implemented, it lacks many features that are considered best practice, such as type safety.  Also, as it is such a dynamic language, key components can morph unexpectedly to express different behaviour when a new dependency is injected into an application.  These issues are not insurmountable, though, and the benefits easily outweigh them.  Some of the weaknesses have also changed the behaviour of developers considerably, by actively encouraging them to use BDD/TDD unit testing at a much earlier stage in development than is usual in other languages, which helps to deliver a higher quality product in the end.

The best thing of all is that it is not owned by any big corporation, which is allowing a very healthy ecosystem.  No vendor has come in and injected enterprise components that rot the heart of the language in the way that happened to Java.  There are signs of a fight for hearts and minds, mainly between Microsoft and Google, over type-safe implementations of JavaScript; this will cause fragmentation for a while but in the long term will benefit all concerned.

Where does that leave legacy languages such as C++, Java and C#?  They will never go away, but they will retreat into their particular niches.  At CloudMargin we use C++ for a database abstraction/caching/security/event service layer; by its nature it needs to be multithreaded, and it is treated and developed more as a system component.  C# comes as output from MapForce for mapping data to and from our client file formats.  Neither the C++ nor the C# components carry business-level functionality in the way the JavaScript microservices do, and they do not take part in the same development lifecycle; they are valuable components, but they play no part in the normal day-to-day business enhancements.


Software development undergoes waves of what is fashionable, and certainly what is coming out now is a breath of fresh air.  However, what is popular now will be as trendy as my father's flares in a few years' time.  Or it may come full circle… perhaps I should be thinking of investing in some bell bottoms!

Wednesday, 8 January 2014

New Venture CloudMargin

I'm getting involved with a new startup: CloudMargin provides a SaaS-based collateral management service for hedge funds and corporate treasury desks.
The frontend is Sencha Ext JS; the backend is the C++ Clarinet data caching engine that I have been working on over the years.  Clarinet provides a lot more than we need, such as web push; however, as it provides client-specific views of data, it is ideal.  Bindings with the frontend are complete, and the database schema is functionally complete for MVP.  The management screens are almost there too.
More to follow...

Wednesday, 31 July 2013

Writing high performance web applications with server side sessions

If you are writing a web server that requires server-side session management, there are a number of issues that can kill performance or blow up in complexity; sometimes literally thinking out of the box can solve them.
Typically, with applications that require server session state, you need a way of finding a session object created in a previous call.  The classic way of doing this is to have a concurrent or mutex-protected hash map to look up the session object, easy peasy.  However, you will find that you get your knickers in a twist when you have to add garbage collection; typically you will need a number of timeouts that get triggered between calls, heartbeats and so on, which quickly complicate things.
Sure, it is possible to use reference-counted smart pointers to protect these objects, but those timeouts need cancelling, and it can get messy very quickly, especially when you have to manage the mutexes on both the session lookup and the session object itself without hitting deadlocks.
Even if you have the time to solve these problems, if you are on a busy server with lots of cores running, these locks can start to seriously limit the throughput possible: Amdahl's law starts rearing its ugly head.
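To make the problem concrete, here is a rough sketch of that classic locked approach.  The class and member names and the expiry policy are illustrative only, not code from a real system:

#include <chrono>
#include <iterator>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>

struct Session {
    std::chrono::steady_clock::time_point last_seen;
    // ... per-session state, often guarded by its own mutex as well ...
};

class SessionStore {
    std::mutex mtx_;                                             // contended on every request
    std::unordered_map<std::string, std::shared_ptr<Session>> map_;
public:
    std::shared_ptr<Session> find_or_create(const std::string& id) {
        std::lock_guard<std::mutex> lock(mtx_);
        auto& s = map_[id];
        if (!s) s = std::make_shared<Session>();
        s->last_seen = std::chrono::steady_clock::now();
        return s;                                                // ref count keeps it alive outside the lock
    }
    void expire_older_than(std::chrono::seconds ttl) {           // the garbage collection that gets messy
        std::lock_guard<std::mutex> lock(mtx_);
        const auto now = std::chrono::steady_clock::now();
        for (auto it = map_.begin(); it != map_.end();)
            it = (now - it->second->last_seen > ttl) ? map_.erase(it) : std::next(it);
    }
};

int main() {
    SessionStore store;
    store.find_or_create("abc123");                  // what every request handler does
    store.expire_older_than(std::chrono::seconds(1800));
}

Every thread serialises on that one mutex, which is exactly where Amdahl's law bites.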
However, do you need all your threads to have access to the same session object?  Could we somehow restrict access to a session to a particular thread?  If so, we could ditch the locking around the session and its lookup.  But how?
The solution that I have come up with is surprisingly simple, and it really does think outside the box and into the router.  The setup that I use is this:
  • You set up a series of internal ports and assign each port to a dedicated thread; that thread then uses its port asynchronously so that it does not block between calls to IO.  This way it can run at 100%, utilising that core to its fullest extent.
  • It is also recommended that you give each thread its own dedicated core on which it runs at high priority, thereby getting the most out of that core's caches.
  • On the initial connection the router distributes a client to one of the ports and a session id is assigned.  That session id is then used by the router to send subsequent requests back to the same port.
With this we have a shared-nothing architecture between sessions, which is great.  Your sessions do not need to see each other, so a shared session pool makes no sense.  Your code is much cleaner, and you will find that the complexities around locking and garbage collection disappear.  You no longer need mutexes, smart pointers no longer need memory barriers, and all is good and very fast!
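As a bare-bones sketch of the shape this takes (the port range, the Session struct and the request handling below are purely illustrative; the real event loop and the router configuration are not shown):

#include <cstdint>
#include <cstdio>
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>

struct Session {
    std::string user;
    std::uint64_t last_seen = 0;    // expiry bookkeeping stays local to one thread
};

// One worker per internal port.  The router pins a session id to one port, so
// only this thread ever touches its map: shared-nothing by construction.
void worker(std::uint16_t port) {
    std::unordered_map<std::string, Session> sessions;   // no mutex, no smart pointers
    sessions.reserve(1024);                               // still only this thread touches it
    std::printf("worker servicing internal port %u\n", static_cast<unsigned>(port));
    // Event loop (sketched): asynchronously read the next request arriving on
    // 'port', look up or create sessions[session_id] with a plain map access,
    // reply, then expire stale entries by last_seen.  All single-threaded.
}

int main() {
    std::vector<std::thread> workers;
    for (std::uint16_t port = 9001; port <= 9004; ++port)
        workers.emplace_back(worker, port);                // optionally pin each thread to a core
    for (auto& w : workers) w.join();
}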

Tuesday, 27 March 2012

Squeezing the most performance out of your site

When writing a web page, or a full-blown application, there are a few things that can turn your site from a greyhound into a plain old dog.
The biggest issue that governs the speed of a website, other than the speed of your server, is the speed of the internet connection.  The speed of light, at 186,000 miles a second (300,000 km/s), is fast; however, for a photon to travel 1,000 miles from a client to a server and back takes around 10ms.  The internet connections that you use also have to contend with ADSL routers, packet contention, and packet-shaping protocols which smooth the flow of data between nodes.  A typical network connection may consist of ten or twenty router hops between client and server and back again.  On top of that you have the TCP protocol, which guarantees that the data has not been corrupted in transit.  And if you want SSL, you will need to double the latency again when making the initial connection, as the two sides set up a secure channel for data to flow.
That 10ms request from client to server could be as bad as 50ms... big deal!  Until you look at how your web application is built.

Client side optimisation

A typical web application may need up to 30 icons or sprites, two or three JavaScript files and a CSS file or two to render properly (if you do not believe me, look at ExtJS, CKEditor etc.); in all you could be hitting the server with forty requests for information.  And that does not include the processing time to generate the requests on the client, interpret and respond to them on the server, and the rendering effort that the web client needs to make.  All told, you could find that your beautiful web application intended to take over the world takes 10 seconds to load, by which time the potential client has moved on to your rival's rather ugly but snappier website.  To mitigate this you can:
  • Combine all your sprites etc. into one file; this reduces the number of network round trips between the client and the server.  There is a great tool out there called SpriteMe which uses your browser's DOM to scan the graphics referred to by the CSS and combines them into one bitmap, along with a list of instructions for modifying your CSS to read the right portion of the bitmap for each sprite.
  • Combine all your JavaScript files into one by concatenating them together, even files from different vendors.
  • Minify your JavaScript.  There are great tools for doing just that, such as the Google Closure Compiler, which loads your JavaScript into a virtual machine and spits it out without any padding or dead code.  A great tool.
  • If it makes sense, you can merge the CSS and JavaScript into the HTML file.  Note, though, that the CSS and JavaScript may be used by more than one page, and that they can be cached by the browser, in which case it makes sense to store them separately from the HTML.
  • Compress your HTML, JavaScript and CSS; this results in fewer network packets being sent back and forth per file.  There are a lot of tools that can do this, just search for them.
If you do all of this you will find that less data is transferred and, more importantly, the number of requests to the server is slashed from forty to fewer than five, which should mean that your website loads in a tenth of the time it did before.

Protocol optimisation

Back in the early 90s the HTTP protocol was great, in that it was an efficient way of getting static, unstyled HTML from a glorified file server to an often text-based web browser on a green screen.  The way it is used now is way beyond what it was initially designed for, and it is creaking a bit in terms of efficiency.
HTTP is a stateless, half-duplex protocol, always initiated by the client, that remembers nothing about what happened in the past.  To manage state and implement web push we have to jump through all sorts of hoops to make it do what we want.  It can also be incredibly complicated, with a morass of standards that use a mass of different data formats, including uuencoding, Base64, SHA hashes and various text encodings including Unicode.  I personally think that it is no longer really fit for purpose, and I would welcome some rationality being injected into any new standard.
There are, however, some tweaks and major enhancements that change the game:
  • Keep-alive: hopefully your server has this one switched on.  It is a set of HTTP headers that tell the client and server to reuse the connection for the next request from the client.  This is a major performance enhancement, especially for SSL, as recreating connections adds latency to a new request.  It does have the downside that it keeps a network port open, which can be a limiting factor when thousands of clients are trying to connect to the server.
  • Expires or Cache-Control headers: these tell your browser how long a file can be kept in its cache before it needs to be reloaded.  The great thing about this is that if the browser knows it does not need to load a file but can get it from its own cache, it will do so, saving a lot of network latency.
  • Web Sockets: this is the biggie.  It is new, so old browsers do not know about it, and you may also have problems with certain firewalls that do not recognise the protocol.  Where it is supported, however, you get essentially a raw socket between the server and the client, where each party can communicate without waiting on the other or on any reply from the other.  This is a true revelation: you can stream requests to the server and get streams of data coming back, along with any server-originated event information.  While you are waiting for a reply from the server you can be busy sending requests for other files down the same channel, so all the bitmap concatenation effort mentioned before matters less, as latency no longer prevents you from making further requests; you get much the same performance as you would by concatenating data in anticipation of such a request.  Web Sockets also lets you receive events from the server as soon as they arrive, without the client having to poll for that information, thereby reducing both the load on the server and delay times.
    It is all very well having this newfangled Web Sockets technology, but how do I use it?  There are numerous libraries that specialise in this and work as extensions to existing stacks, including PHP, Node.js, ASP.NET and C++.  These tools vary in their maturity, though, and the WebSockets spec is still being tweaked.

Serverside optimisation

This is where things get more interesting.  Developers are torn between performance and scalability, and I often think that too much worry about the latter hurts the former when there was no need to worry in the first place!  In early web applications there was clearly too much data on the hard drive for it all to be loaded into memory and served to clients.  As hard drives are much slower than memory, you often needed a cluster of servers to handle the traffic to the site.  However, the content of most websites nowadays can easily fit into the memory of a well-endowed server.  Servers are now much faster and have many cores, allowing a lot of work to be done concurrently.
The problem with conventional stateless architectures is that every database object loaded into memory to serve a web request is thrown away afterwards.  This increases latency, as the server-side logic has to wait for the database to do its work on every request.
A caching middle layer uses the database only to persist objects; all (or at least the most important) objects are stored in memory and retrieved from memory using some form of lookup grammar.  With this, server-side performance can increase tenfold and perceived client latency can halve, while server resources are used more efficiently: there is less internal network traffic, and the database is less busy and free to do the important stuff such as persistence and locking.
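As a sketch of the idea (this is not the Clarinet engine; the Record type and load_from_database() below are hypothetical stand-ins for the persistence layer):

#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>

struct Record { std::string payload; };

// Stand-in for a real database query; only executed on a cache miss.
std::shared_ptr<Record> load_from_database(const std::string& key) {
    auto rec = std::make_shared<Record>();
    rec->payload = "row for " + key;
    return rec;
}

class ReadThroughCache {
    std::mutex mtx_;                                              // one coarse lock keeps the sketch short
    std::unordered_map<std::string, std::shared_ptr<Record>> store_;
public:
    std::shared_ptr<Record> get(const std::string& key) {
        std::lock_guard<std::mutex> lock(mtx_);
        auto it = store_.find(key);
        if (it != store_.end()) return it->second;                // hit: served straight from memory
        auto rec = load_from_database(key);                       // miss: the only time the database is touched
        store_[key] = rec;
        return rec;
    }
};

int main() {
    ReadThroughCache cache;
    cache.get("trade:42");     // first call loads from the database
    cache.get("trade:42");     // second call is a pure in-memory lookup
}

The real thing needs invalidation and write-through on top, but the latency win comes from the second call never leaving the process.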
Another way to improve the performance of HTTP and Ajax requests is to use multipart requests and responses.  Due to the half-duplex nature of HTTP it is best to package all your requests at once rather than one after another, or you may find your application blocks more often than you would like.  If, however, you are using WebSockets then forget it: this optimisation makes no difference, as the channel is fully asynchronous and you can stream your requests without waiting for the replies.
Also look at content delivery networks: these cache your data closer to the client than your central site does.  If your packets travel only 100 miles rather than 1,000, you will see quite considerable performance gains, especially for poorly optimised websites.
Well worth reading: http://developer.yahoo.com/performance/rules.html

Tuesday, 21 February 2012

Hackathon @ the Guardian

I have to say that I like the Guardian's offices they are really nice.
I went to a hackathon there last Saturday (18 Feb 2012) with my daughter, 14 (artwork and business consultant), and Vlad, 15? (PHP and JavaScript), a friend of the family.  After a brief intro from a number of different API vendors we started hacking together a solution for helping people find events at conferences.  Our team also included:
Steven De Salas
Munthir Ghaleb
Praveen E
Pooya Yavari
I had a howling cold, which did not help things much.  After five hours the components we had made looked beautiful, but we had run out of time getting them to talk to each other, and so we had little to show.  The other teams made less ambitious components and delivered something to show people.
We really enjoyed the teamwork, and taking part was as important as winning (yes, I am English).  It also taught us a good lesson in time management!

Tuesday, 10 January 2012

Non blocking C++ Ajax bridge code generator

I have come to the conclusion that writing Ajax web services in C++ is really painful.  First you need to know the HTTP protocol, C++ and JavaScript; second, you need to design your own communication protocol on top of that which marshals JSON messages to and from the web browser; third, you need to design it so that it does not gobble up tonnes of threading resources; fourth, you need to make it secure; and finally you need to make it fast.
There are tools for this: gSOAP is a great library for communicating via SOAP, but not with web clients.  Yes, you can communicate between SOAP and JavaScript clients, but SOAP is so unfriendly and inefficient that there should be something better.  You can use the open-source JSON libraries and build your own HTTP stack.  A great library for communicating in a non-blocking way is ASIO, but there is no code generator to work with it.
I have therefore written my own code generator that parses C++ headers and produces the necessary C++ and JavaScript over the ASIO library.  It is fast and non-blocking: it processes network packets as they arrive and then returns to the event queue to wait for the next network packet to complete the message.
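For a flavour of that non-blocking style, here is a stripped-down sketch written directly against Boost.Asio (using the modern io_context naming).  It accepts a single connection and just prints the bytes that the generated code would hand to its JSON parser; it is not the generator's actual output:

#include <array>
#include <boost/asio.hpp>
#include <iostream>
#include <memory>

using boost::asio::ip::tcp;

// Read whatever bytes have arrived, hand them on, then re-arm the read and
// return to the event loop; no thread ever blocks waiting for a full message.
void read_more(std::shared_ptr<tcp::socket> sock,
               std::shared_ptr<std::array<char, 4096>> buf) {
    sock->async_read_some(boost::asio::buffer(*buf),
        [sock, buf](boost::system::error_code ec, std::size_t n) {
            if (ec) return;                    // connection closed or errored
            std::cout.write(buf->data(), n);   // in the generated code: feed the JSON parser
            read_more(sock, buf);              // go back to waiting for the next packet
        });
}

int main() {
    boost::asio::io_context io;
    tcp::acceptor acceptor(io, tcp::endpoint(tcp::v4(), 8080));
    auto sock = std::make_shared<tcp::socket>(io);
    acceptor.async_accept(*sock, [sock](boost::system::error_code ec) {
        if (!ec) read_more(sock, std::make_shared<std::array<char, 4096>>());
    });
    io.run();                                   // single-threaded event loop: handlers fire as packets arrive
}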
My next step is to make it work transparently over web sockets, degrading back to long polling so that clients can still receive events generated by the server.
If there is anyone who wants to try out a beta version of this generator then please let me know.

Friday, 15 April 2011

Function profilers

Having written your magnum opus that does everything, as well as making tea for the head trader of investment bank XYZ, you get a complaint that the code runs like a dog and he wants it fixed or the project is binned.  It can, though, be a nightmare to find those areas of code that could provide the factor-of-ten improvement you are looking for.  Well, it is time to get out your profiler.  The ones that I have used have been extremely complicated to use, so I wrote my own.
#include "function_profiler.h"
 
 #define USE_PROFILING
 
 int main()
 {
 #ifdef USE_PROFILING
   xt::stack_time stack_timer;
   PROFILE_FUNCTION();
 #endif
 
 //Scatter these throughout your recursive functions
 PROFILE_FUNCTION();
 do_something_interesting()
 
  {
   PROFILE_FUNCTION1("load_data");
   do_something_even_more_interesting()
  }
 //If you want more than one in a function put them into their own stack space by wrapping them in "{}"s.
 
 #ifdef USE_PROFILING
   stack_timer.print_to_file("Load.xml");
 #endif
 }
Done!  You will now see an XML file that you can load in your browser, giving a structured log of how long each function took to process and a breakdown of each child function.  The code is thread-safe and you can have one profiler per thread; just make sure that you give the output files different names!
Here is the code: core.zip (40.83 kb).  There are some other things in there which you can ignore or use as you like.  Some of my projects require RogueWave and some can have Boost, so this library contains an abstraction layer to support either; e.g. my string class is a std::basic_string<TCHAR> with Boost and an RWCString with RogueWave.  It also includes support for threads, including ZThread, and a universal logger which I will explain in a later blog entry.