J. Andrew Rogers
andrew at jarbox.org
Thu Oct 10 19:07:48 PDT 2013
On Oct 10, 2013, at 10:22 AM, Stephen D. Williams <sdw at lig.net> wrote:
> Just like embedding SQL in application code is usually a terrible practice, I'm past living with machinery that only understands relational databases or standard data types. Application-specific DAOs may or may not be a clean alternative. Even the form of persisted state and variables can be highly restricting. Do I want to squash all state to strings in a flat map? Not in my most interesting cases. If I want to communicate between modules / tiers and maintain state with Google protocol buffers or JSON or W3C EXI or my own efficient RDA-like interchange (ERI), with deltas, and use graph and document databases along with relational, which framework is best?
> It's fine to have a nice baseline for common cases, but is it separable and replaceable with modest effort while maintaining performance, scalability, maintainability, *ility?
This is only possible if your framework uses a single physical representation of the data model underneath all operations. That allows interfaces to be usefully independent of the implementation of that representation. A poly-representation implementation, such as secondary indexes or multiple databases, makes real scalability implausible. Operations that cross representation boundaries are a pathological pattern in large-scale systems. Universalizing representation is one of the core problems in massively parallel algorithm design.
For example, there is no reason in principle that relational, graph, and document databases need be anything but interfaces on a single database engine with a single data representation (i.e. no secondary indexes), since their practical expressiveness is identical. However, your framework can't code its way around the fact that most databases are not designed this way and still expect scalable flexibility.
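As a rough illustration of that point, here is a minimal Python sketch, assuming a hypothetical engine that stores everything as flat (entity, attribute, value) facts. The names Engine, DocumentView, RelationalView, and GraphView are invented for the example and do not refer to any real database. Performance is ignored entirely; the sketch only shows three interfaces reading one representation with no secondary indexes.

```python
from typing import Any, Callable, Dict, Iterator, List, Tuple

# Hypothetical single physical representation: one flat list of
# (entity, attribute, value) facts. No secondary indexes are kept.
Fact = Tuple[str, str, Any]


class Engine:
    def __init__(self) -> None:
        self.facts: List[Fact] = []

    def put(self, entity: str, attribute: str, value: Any) -> None:
        self.facts.append((entity, attribute, value))

    def scan(self, pred: Callable[[Fact], bool]) -> Iterator[Fact]:
        # Every view below is expressed as a scan over the same facts.
        return (f for f in self.facts if pred(f))


class DocumentView:
    """Document interface: one entity as an attribute -> value mapping."""

    def __init__(self, engine: Engine) -> None:
        self.engine = engine

    def get(self, entity: str) -> Dict[str, Any]:
        return {a: v for _, a, v in self.engine.scan(lambda f: f[0] == entity)}


class RelationalView:
    """Relational interface: entities projected onto named columns."""

    def __init__(self, engine: Engine) -> None:
        self.engine = engine

    def select(self, columns: List[str]) -> List[Tuple[Any, ...]]:
        rows: Dict[str, Dict[str, Any]] = {}
        for e, a, v in self.engine.scan(lambda f: f[1] in columns):
            rows.setdefault(e, {})[a] = v
        # Keep only entities that have a value for every requested column.
        return [tuple(r[c] for c in columns)
                for r in rows.values() if all(c in r for c in columns)]


class GraphView:
    """Graph interface: an attribute is treated as a labeled edge."""

    def __init__(self, engine: Engine) -> None:
        self.engine = engine

    def neighbors(self, entity: str, label: str) -> List[Any]:
        return [v for _, _, v in
                self.engine.scan(lambda f: f[0] == entity and f[1] == label)]


engine = Engine()
engine.put("alice", "name", "Alice")
engine.put("alice", "knows", "bob")
engine.put("bob", "name", "Bob")

print(DocumentView(engine).get("alice"))
print(RelationalView(engine).select(["name"]))
print(GraphView(engine).neighbors("alice", "knows"))
```

Swapping one view for another here touches no stored data, which is the separability being asked about. Whether that property survives at scale depends on the engine's representation, not on the interface code.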
There are similar representation universality challenges with other parts of the stack. At small scales you can ignore them, but they loom large when scaling non-trivial software.
> The overall stack is extremely important and best practices keep evolving. I'm reluctant to use a stack that tightly binds all components if it is difficult to replace or use multiple solutions for key aspects without heroic effort. Working with Rails didn't excite me. Some Java stacks look nice, but the endless layering and intertwining of many libraries also makes me uncomfortable.
Tight coupling is a natural side-effect of designing systems to minimize data motion in order to maximize performance and scalability. Loosely bound components make replacement much easier but necessarily create far more data motion. It is a tradeoff.
If you are not moving data to the code then you need to move the code to the data. Have you considered what that would need to look like in a software system? If you take the design goal of having a single representation of the data and minimizing data motion to its logical conclusion then the system asymptotically converges on being structurally monolithic even in complex, massively distributed software systems.
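To make the data-motion difference concrete, here is a toy Python sketch, assuming a hypothetical Partition class that stands in for a node holding a slice of the data. Nothing here is a real distributed API; the "boundary" is just a method call, and the sketch simply counts how many values cross it in each style.

```python
from typing import Callable, List


class Partition:
    """Hypothetical node holding one slice of the data."""

    def __init__(self, rows: List[int]) -> None:
        self.rows = rows

    def fetch_all(self) -> List[int]:
        # Data moves to the code: every row crosses the boundary.
        return list(self.rows)

    def run(self, fn: Callable[[List[int]], int]) -> int:
        # Code moves to the data: only the result crosses the boundary.
        return fn(self.rows)


partitions = [Partition([1, 2, 3]), Partition([4, 5]), Partition([6, 7, 8, 9])]

# Style 1: pull the data to the caller, then compute.
pulled = [p.fetch_all() for p in partitions]
total_pull = sum(sum(rows) for rows in pulled)
values_moved_pull = sum(len(rows) for rows in pulled)

# Style 2: ship the computation to each partition, combine partial results.
partials = [p.run(sum) for p in partitions]
total_push = sum(partials)
values_moved_push = len(partials)

print(total_pull, values_moved_pull)
print(total_push, values_moved_push)
```

Both styles produce the same total, but the second moves one value per partition rather than one per row. The cost is that the caller now depends on each partition being able to execute its code, which is the coupling described above.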
I would make the point that swapping out lots of components is necessitated primarily by the lack of universality in the underlying representation implementation.