“In learning you will teach, and in teaching you will learn.” : April 2015

Thursday, 9 April 2015

Software Architecture - Examples

In the article Software Architecture I said that I usually start the architecture of applications by thinking only about how to solve a problem using the idea of a technology, not considering any limitations of actual implementations. Only later do I try to choose a technology, and in many cases I decide to write my own because the existing ones don't accommodate my needs.

I know it is hard to get that idea without examples, so in this post I will try to show application designs ranging from the complete lack of architecture to an architecture bound to a technology and also an architecture that I consider unbound to technologies and their limitations.

Fictional Purpose

Create a simple web application that lists categories (with any level of sub-categories) and products. Such an application will not edit anything and there's no e-commerce at all. The data will be entered using another application and the sales will be done by phone or by some other means.

It is actually planned that in the future there will be an editor application and even e-commerce on the site, but that depends on the success of the current code. So, there is no need to write those other applications for now, but it is good to be prepared to receive them.

No Architecture - "Being Too Agile"

I don't want to criticize the Agile methodology as I think it has many valid points. Unfortunately, many developers justify the complete lack of architecture as being "Agile", and that's why I am using "Being Too Agile" as the title of this topic.

This "too agile" may happen because there's no discussion at all and developers simply start doing things or because there are discussions focused on the wrong things before starting, maybe more focused on how to name things (like private members and database tables) instead of really focusing on code reuse and the evolution of the application. That is:

A database is immediately created using the database the team is most used to (for example, SQL Server). Two tables are created (Category and Product, regardless of whether they get different names by the team's standards). At this moment, the team is minimalist, so there's only the Id, IdParentCategory (that can be null) and the Name for the categories and Id, IdCategory, Name and Description for products. No other tables or columns. Obviously, some fake data is inserted;
The web application is created with a single form. To "reuse" code, a DBHelper class (which is completely bound to SQL Server and has only two methods: ExecuteNonQuery and GetDataTable) is copied from another project and the actual connection string, which happens to be hard-coded in it, is changed to the new database. In the main form there's a tree view for the categories and a list box for the products. The base categories are loaded in the constructor (using the DBHelper class to return a DataTable and then iterating it to populate the tree, in the format "Id - Name") and, when clicking on any category, the category Id is extracted from the item's text and used to generate the query of products and sub-categories, by concatenating the Id to the base queries (hard-coded in the events).

As you can imagine, all the code is in the form's code behind and the only class added to the project is the DBHelper class.
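
To make that description concrete, the click handler boils down to something like the sketch below. The class name, method names and exact query text are assumptions made for illustration; only the "Id - Name" format and the Product columns come from the description above.

```csharp
using System;

public static class FormCodeBehindSketch
{
    // Hypothetical helper: pulls the category Id back out of the
    // tree view item's text, which is in the format "Id - Name".
    public static string ExtractId(string itemText)
    {
        int separator = itemText.IndexOf(" - ", StringComparison.Ordinal);
        return itemText.Substring(0, separator);
    }

    // The query is built by concatenating the extracted Id to a
    // hard-coded base query, which is the practice criticized here.
    public static string BuildProductsQuery(string itemText)
    {
        return "SELECT Id, Name, Description FROM Product WHERE IdCategory = "
            + ExtractId(itemText);
    }
}
```

Notice that the UI text is the only place where the Id lives: if the item text ever stops starting with "Id - ", this code stops working.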

To me, this is the complete lack of architecture but some people (in particular managers that took only one or two programming lessons but believe they know much more than they actually do) can see only good points, like:

Everything that happens on the form can be found by looking at the form's code behind file. There's no need to navigate many "layers of files", there's no complexity to find the implementation of an interface or anything like that;
As the DBHelper is built for a specific database, it avoids the "slow" virtual calls;
Changing the code to use another database is possible: it is enough to change the DBHelper class source code;
The application is up and running pretty fast;
Junior developers can maintain the application.

The bad points? Probably none to someone who completely agrees with all the "good points" I just presented. If that's not the case, then there are lots of problems, but I believe they will become obvious when reading the rest of this post.

Code Reuse... Or Not

The customer sees the application. Obviously he requests changes to the UI, layout, etc., but I will ignore them here. The application is running and there's no request to change its code or architecture.

Yet, the customer thinks that it would be great to have a native application too, so things can run faster and outside the browser. Let's not discuss if browsers are faster now or that the performance difference will not be noticed. Let's simply accept that a native application will be created.

Can we reuse any of the existing code?

If we don't count copying the DBHelper to a new project or copying the hard-coded queries from the current web application to the new application, then there's nothing to reuse. We can use the current project as the "inspiration" for the new application, looking at its code to take the parts we want, but there's no direct reuse like importing a library.

Considering this is a really small application, it would be OK to do a copy, but now is the moment that the future changes are probably taken into account, and it seems better to change only one place than change two places every time. One of the first considered changes is "what if in the future we should not show some categories (like empty ones) or products (that have a deleted flag, an expiration date or similar)?"

As the developers don't see any way to reuse the UI, they want to reuse the database queries and they think about these two solutions:

Create a Queries unit that will contain all the queries of the application, and the same file will be shared by both applications (not copied);
Create Stored Procedures to do the job in the database.

And I can tell that in most cases the second solution will be used. There are actually many arguments to go in that direction:

Changes to the database will not require change to the applications or recompilation of the applications;
Stored Procedures are stored in an optimized way inside the database and some even further optimize themselves according to the use, so they are definitely faster than executing different queries from the code;
It is said that stored procedures can avoid SQL Injection because Stored Procedures are parameterized, but this is a half-truth (the procedure itself is parameterized, but code that concatenates strings to do the EXEC PROC is still susceptible to SQL Injection);
Regardless of the half-truth, it is true that direct access to the tables can be forbidden, so a DBA can protect the database by forbidding direct access to tables, preventing deletes and updates from happening if there are no procedures for such actions and also preventing a query from taking too long because a WHERE clause was not used in a SELECT;
There's a standard on the code: To call anything on the database there's an EXEC PROC followed by a procedure name and all the parameter values it needs.

So, this is the direction the team takes.

In this case, it happened early in time as the application only has 3 queries. That means that 3 stored procedures are created and the web application is changed to use the stored procedures instead of doing the direct SELECT commands. It would be terrible if this change happened after having 50+ queries.

For now, consider that the database parameters are not used, so the code is susceptible to SQL Injection, but there's nothing an SQL Injection can do to corrupt the database at this moment, as only the stored procedures are accessible, they are read-only and all the data is public.
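
A minimal sketch of that half-truth, assuming a hypothetical procedure named GetProductsByCategoryText (the procedure names are not given here): the procedure may be parameterized, but the command text that calls it is still built by concatenation.

```csharp
using System;

public static class ExecConcatenationSketch
{
    // Hypothetical procedure name, used only for illustration.
    // The value is glued into the command text instead of being
    // sent as a database parameter.
    public static string BuildCommandText(string categoryText)
    {
        return "EXEC GetProductsByCategoryText '" + categoryText + "'";
    }
}
```

Calling BuildCommandText with the text "1 - Books'; DROP TABLE Product; --" produces a command that contains a second statement. In the scenario described, the restricted permissions are what keep that statement from doing damage, not the stored procedure. Sending the value as a command parameter would keep it as data.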

With this database change, it is now possible to create the native application and "reuse some code". Even the "Id - Name" formatting for the category is done by the stored procedure now and the code of the application doesn't extract the ID anymore (I will not even explore what will happen if the ID can't be displayed). The stored procedures receive the entire Category text and extract the ID. The developers need to recreate the UI, but the "real logic" of the queries is reused. The only code on the application is to add the tree view item's Text at the end of an EXEC PROC call and to read the results and create new tree view items or populate the list box.

Connectivity Problem

To be honest, most DBAs would never allow the database to be externally visible, but let's say the team is not really working with a DBA. They are simply "solving problems fast" and the fastest thing to do was to allow the native application to directly connect to the database.

It worked on some tests but there are two main problems when used in production:

It doesn't work if a proxy server is required to connect to the internet;
If too many native applications are connected, there's an excessive load on the database server, even if most of the connections are inactive.

To the first problem there's no easy solution, as the database connections simply can't pass through a proxy server.

The root cause of the second problem is that the connection string used by the DBHelper class is using a connection pool. It doesn't matter that the pool only has one connection. Every native application is keeping one connection alive. The connection pool is great on the web application, as the same connections are used regardless of the client, but it is terrible on the native application that runs on the clients' computers.

To solve this issue the connection pool is disabled for the native application. As the DBHelper is a copy, the change doesn't affect the web application. But now the native application becomes slow... in many cases much slower than using the web application as it loses more time connecting and disconnecting from the database than reading some data.

Web Service

It seems that the only valid solution to keep the database server free and to pass through proxies is to use a web service. The web service can actually live on the same server as the web application, sharing the connection pool.

So, the decision is to create a web service to represent all the methods they are using from the DBHelper:

interface IWebService
{
  string GetDataTableAsXml(string commandText);
}

Yeah, they are using only one method. A web-service like this is clearly not how web-services are supposed to work but this is a work-around to allow the code to continue using a DBHelper class and doing those EXEC PROC calls.

So, the actual job is to create a web-service that executes the received command and converts the data-table to a string, and to change the native application's DBHelper class to use the service and convert the string result back to a DataTable. The application itself will continue to use the DBHelper class and the same SQL commands, so this is the smallest change possible for the application right now.
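
The conversion in both directions could look like the sketch below. It assumes the team relies on the DataTable's own XML support (WriteXml and ReadXml); the class and method names are made up for illustration, and the call that actually executes the command on the server is left out.

```csharp
using System.Data;
using System.IO;

public static class DataTableXmlSketch
{
    // Service side: serializes the table, including its schema, to a string.
    // WriteXml requires the DataTable to have a TableName.
    public static string ToXml(DataTable table)
    {
        using (var writer = new StringWriter())
        {
            table.WriteXml(writer, XmlWriteMode.WriteSchema);
            return writer.ToString();
        }
    }

    // Client side: rebuilds a DataTable from the string returned by the
    // service, using the inline schema to recreate the columns.
    public static DataTable FromXml(string xml)
    {
        var table = new DataTable();
        using (var reader = new StringReader(xml))
        {
            table.ReadXml(reader);
        }
        return table;
    }
}
```

With helpers like these, the native application's DBHelper.GetDataTable only needs to call the service and pass the result to FromXml, which is why the rest of the application doesn't notice the change.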

It works. Many developers would be afraid to deal with that kind of "architecture" (or will try to kill someone) but it works. Bad things will happen if we can't show the ID on the category, but that's not a requirement right now.

Starting Differently

If you think that the previous solution was terrible, well, I completely agree with you. Yet it shows a problem that happens frequently:

An application is created with a bad or nonexistent architecture;
All the changes that come later are required to be minimal (usually by time constraints or because the "architects" don't accept their "architecture" is completely flawed), being more work-arounds than a design/architecture change.

But what would happen if they started differently? What would happen if, since the beginning, it was said that a stateless web-service was required, capable of listing categories, sub-categories and products?

I can see that a web-service interface like this would be written:

interface IWebService
{
  // Gets all the categories from the given path.
  // To get the base categories, pass null or an empty string.
  // Returns only the sub-category names. 
  // To create the full path, use the previous path + "/" + categoryName.
  string[] GetCategories(string path);

  ProductInfo[] GetProducts(string path);
}

And the ProductInfo will have all the info for the product that's not already present on the call. That is, it will not have the category path, as it comes on the request, but it will have Name, Description and any new column that may be necessary to present the product.

I am not going too far with the changes, so keep the idea of that DBHelper class and string concatenation when dealing with the database.

The applications

In this case, the Web Application can either use the web-service as an external service or it can invoke the web-service implementation directly; after all, they are on the same server.

The client application would use the service from the start and the problems related to direct database access would never exist.

Considering there's only the service accessing the database, it is possible that the stored procedures are never created (remember that the stored procedures were created in the other situation to share some code and in this case the service is shared). So, all the queries can be part of the web-service itself.

Of course there's a big difference from before: There are no IDs being sent from the client to the server anymore. The paths are actually created combining the selected category with all its parents (code that is probably going to be copied in both applications, as it is not that big). This also means that the service will probably split the path to find each category ID, executing many more queries when a 10th level category is used.
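
A rough sketch of that path splitting, using an in-memory list in place of the Category table so that it runs without a database. All names here are assumptions; each call to FindChildId stands in for one query, which makes the cost of deep paths visible.

```csharp
using System.Collections.Generic;
using System.Linq;

// Mirrors the columns of the Category table described earlier.
public sealed class CategoryRow
{
    public int Id { get; set; }
    public int? IdParentCategory { get; set; }
    public string Name { get; set; }
}

public sealed class PathResolverSketch
{
    private readonly List<CategoryRow> _rows;

    // Counts how many "queries" were needed so far.
    public int LookupCount { get; private set; }

    public PathResolverSketch(List<CategoryRow> rows) { _rows = rows; }

    // Stands in for one query that finds a category by parent and name.
    private int? FindChildId(int? parentId, string name)
    {
        LookupCount++;
        CategoryRow row = _rows.FirstOrDefault(
            r => r.IdParentCategory == parentId && r.Name == name);
        return row == null ? (int?)null : row.Id;
    }

    // Walks a path like "Books/Fiction/Fantasy" one segment at a time.
    public int? ResolvePath(string path)
    {
        int? currentId = null;
        foreach (string segment in path.Split('/'))
        {
            currentId = FindChildId(currentId, segment);
            if (currentId == null) return null; // Unknown category.
        }
        return currentId;
    }

    // Same contract as IWebService.GetCategories: null or empty means base categories.
    public string[] GetCategories(string path)
    {
        int? parentId = null;
        if (!string.IsNullOrEmpty(path))
        {
            parentId = ResolvePath(path);
            if (parentId == null) return new string[0];
        }
        return _rows.Where(r => r.IdParentCategory == parentId)
                    .Select(r => r.Name).ToArray();
    }
}
```

Resolving a three-level path costs three lookups before the real query even starts, so a 10th level category costs ten.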

If such navigation of the categories becomes a problem, the future optimizations will probably include:

Caching results on the service. At least while the database is read-only, any kind of cache that avoids new database calls would be great;
Creating an extra table with the full paths and the category ID. So, it would be possible to do a single query for products or sub-categories with an inner join to that new table, which can use an equals comparison for the full path;
Simply putting the full path as a new field in the product and category tables. This is probably the solution that uses most database space, but it avoids a new table and the joins.

The shared problem with the first case is that by using that DBHelper and string concatenation it is still possible to suffer SQL Injection.

As my purpose is not to discuss SQL Injection but different architectures, I want you to think about this:

Can you see how the data related stuff was written in a completely different manner simply because we started with a stateless web-service?

I am not saying the applications are much better now. They still have the events written in the code behind, using foreach over the results to create new tree view items, using the data from the UI components to get the category paths and completely ignoring design patterns like MVVM or MVC.

Yet, the decision on how to write the database is different. The queries executed are probably going to be different. And the applications don't have any database query anymore (even if EXEC PROC doesn't show what happens internally, it is a database query). Actually, it is even possible to write a web-service that never accesses a database, using an XML file or a completely different thing, and the applications don't need to change.

The Basic Object-Oriented Approach

Both solutions presented until now are completely different from the most basic object-oriented (OO) approach.

Anyone who knows the basics of object-oriented programming will naturally think about two classes: Category and Product. I am not talking about database tables. I am talking about classes.

The Category would probably have these members:

static SomeCollectionType<Category> BaseCategories { get; }
static Category GetBaseCategoryByName(string name);

Category Parent { get; } // Can be null
string Name { get; } // Can't be null

SomeCollectionType<Category> SubCategories { get; }
SomeCollectionType<Product> Products { get; }
// I will soon discuss the SomeCollectionType

Category GetSubCategoryByName(string name);
Product GetProductByName(string name);

And the Product will probably have these properties:

Category Category { get; }
string Name { get; }
string Description { get; }
// Any other property that seems necessary, like Price, Picture etc.

Thinking about classes to represent Category and Product but also forgetting about a better architecture, developers could simply implement the SubCategories and the Products properties to call the DBHelper.GetDataTable method and make these classes completely bound to the database. There's a big chance they will also expose an Id property.

Also, the use of real properties for the collections or methods like GetSubCategories() and GetProducts() and the result types of array, IEnumerable or something different will greatly depend on how well developers know the Object Oriented principles, how much they will tie the result types with the actual implementation and also how well they follow the .NET guidelines.

The use of methods is recommended to make it clear that the action may take time, but this is kind of binding it to an implementation (what if all the items are in memory already?);
The use of properties is the most common way to represent child items in most cases, but it lacks the information that a slow database call may happen;
I am not covering it in this post, but maybe a Task<SomeCollectionType<T>> should be used as the result type to support asynchronous implementations;
Some developers may return lists simply because when reading the database they don't know how many records are there and putting things to a list is their "natural" choice. Maybe they don't know any good practice telling not to return modifiable collections, maybe they don't care as a new list is created each time, so there's no problem if the receiver changes the result;
Some developers may return arrays because they are following some pattern like the Reflection methods. People can insert new items into a list, not into an array (but they forget that arrays aren't read-only and their items can be replaced);
Other developers will use a ReadOnlyCollection because they know the result is not expected to be modified;
And others will return IEnumerable either because they learned it is the right thing to do or because they want to use yield return and avoid pre-loading all the records.
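
The practical difference between some of these result types can be seen in a small sketch. The class and the sample names are assumptions; Array.AsReadOnly, ReadOnlyCollection and yield return are standard .NET features.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public static class ResultTypeSketch
{
    private static readonly string[] Names = { "Books", "Music" };

    // Returning the array itself: callers can't add items,
    // but they can replace the existing ones.
    public static string[] GetNamesAsArray()
    {
        return Names;
    }

    // A ReadOnlyCollection wraps the same data and rejects changes.
    public static ReadOnlyCollection<string> GetNamesAsReadOnly()
    {
        return Array.AsReadOnly(Names);
    }

    // An iterator produces the items lazily, without pre-loading them.
    public static IEnumerable<string> EnumerateNames()
    {
        foreach (string name in Names)
            yield return name;
    }
}
```

Writing to GetNamesAsArray()[0] silently changes the shared data, while trying the same through the read-only wrapper throws a NotSupportedException.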

No ORM

Notice that here I am not talking about using an ORM. An ORM would probably generate a similar set of classes, but that would be required by the ORM to map to the database and that's not my focus. I am simply talking about the basic OO idea of having objects that can contain data and behavior.

It would be possible to put these 2 classes into a shared DLL and end up doing a web application and a native application that are pretty similar to the first case. That is, the native application would still have direct access to the database by using a DLL that has direct access to the database.

The biggest difference is that we will always see Category and Product as objects. Then the objects will do the database job. This would probably change how things are put into the Tree View, so either the objects are put into the Tree View directly and a very basic data template is used to show the Name, or they are somehow stored in a property like Tag to be accessed later without extracting paths or ids from the tree view items.

It would also be possible to have a case similar to the second one, creating a web-service on top of these classes and writing the applications on top of the service. But then the OO approach will very likely only exist inside the service. For the two applications things will stay as a stateless service, not as an Object-Oriented approach.

Object Oriented without Limitations

Wouldn't it be better if things started with the object oriented approach, allowing both applications to deal with Category and Product objects, but without the problems?

That is, the web application could use the objects directly and those objects will access the database. The native application will also be coded as if it were using those objects directly, while actually communicating with a web-service.

So, is this possible?

And the answer is yes. And this is what I mean when I talk about creating an application architecture without considering limitations. I don't need to think about stateless objects. Being stateless is a communication limitation, not part of the application architecture.

So, the only "constraint" I use is that to support all the different scenarios I must start things with interfaces, not classes.

That is, I will have an IProduct and an ICategory. These interfaces could look like this:

using System.Collections.Generic;

public interface ICategory
{
  ICategory Parent { get; } // Null for a base category.
  string Name { get; }

  IEnumerable<ICategory> SubCategories { get; }
  IEnumerable<IProduct> Products { get; }

  ICategory GetSubCategoryByName(string name);
  IProduct GetProductByName(string name);
}
public interface IProduct
{
  ICategory Category { get; }
  string Name { get; }
  string Description { get; }
  // Any other property appropriate for a product here.
}

As we can't have static methods on interfaces, we will need an "entry point" to get the base categories. This can be another interface:

public interface IBaseCategories
{
  // Requires System.Collections.Generic.
  IEnumerable<ICategory> Categories { get; }

  ICategory GetCategoryByName(string name);
}

Having these interfaces we could write the two applications without any dependency on specific implementations. Those interfaces can be implemented in completely different manners, so we can have implementations that load things from the database, that load things from XML, by using a service and why not a test implementation that simply instantiates two categories with two products each directly? This is what will probably happen when testing the applications for the first time. Instead of dealing with an actual database connection, we simply make things work on top of the interface with a fake implementation.
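
Such a test implementation could be as small as the sketch below. The interfaces are repeated, in generic form, only so the block compiles on its own; the Fake* class names and the sample data are assumptions.

```csharp
using System.Collections.Generic;
using System.Linq;

public interface ICategory
{
    ICategory Parent { get; }
    string Name { get; }
    IEnumerable<ICategory> SubCategories { get; }
    IEnumerable<IProduct> Products { get; }
    ICategory GetSubCategoryByName(string name);
    IProduct GetProductByName(string name);
}
public interface IProduct
{
    ICategory Category { get; }
    string Name { get; }
    string Description { get; }
}
public interface IBaseCategories
{
    IEnumerable<ICategory> Categories { get; }
    ICategory GetCategoryByName(string name);
}

public sealed class FakeProduct : IProduct
{
    public FakeProduct(ICategory category, string name, string description)
    {
        Category = category; Name = name; Description = description;
    }
    public ICategory Category { get; }
    public string Name { get; }
    public string Description { get; }
}

public sealed class FakeCategory : ICategory
{
    private readonly List<ICategory> _subCategories = new List<ICategory>();
    private readonly List<IProduct> _products = new List<IProduct>();

    public FakeCategory(ICategory parent, string name) { Parent = parent; Name = name; }

    public ICategory Parent { get; }
    public string Name { get; }
    public IEnumerable<ICategory> SubCategories => _subCategories;
    public IEnumerable<IProduct> Products => _products;

    public ICategory GetSubCategoryByName(string name) =>
        _subCategories.FirstOrDefault(c => c.Name == name);
    public IProduct GetProductByName(string name) =>
        _products.FirstOrDefault(p => p.Name == name);

    public void AddProduct(string name, string description) =>
        _products.Add(new FakeProduct(this, name, description));
}

// Two categories with two products each, created directly in memory.
public sealed class FakeBaseCategories : IBaseCategories
{
    private readonly List<ICategory> _categories = new List<ICategory>();

    public FakeBaseCategories()
    {
        foreach (string categoryName in new[] { "Books", "Music" })
        {
            var category = new FakeCategory(null, categoryName);
            category.AddProduct(categoryName + " product 1", "First fake product");
            category.AddProduct(categoryName + " product 2", "Second fake product");
            _categories.Add(category);
        }
    }

    public IEnumerable<ICategory> Categories => _categories;
    public ICategory GetCategoryByName(string name) =>
        _categories.FirstOrDefault(c => c.Name == name);
}
```

An application that only ever holds an IBaseCategories reference can start with `IBaseCategories categories = new FakeBaseCategories();` and receive a database or service implementation later without touching the rest of its code.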

Of course, at some moment we will need to write an implementation that uses the database (which can have Id properties too) and we could end up making the bad choice of using that implementation on the native application. But replacing that implementation with one that uses a web-service will only require a one-line change to instantiate a different IBaseCategories implementation (and if done really correctly could avoid recompiling the application, but I am ignoring that part of the architecture for now).

So, up to this moment, the most basic architecture that I consider to be free of implementation problems is:

Think about all the classes you need, as you would when drawing a UML diagram and, to guarantee that you are not bound to any implementation detail, write everything as interfaces;
Actually, there's no item two. As long as the interfaces represent the right behavior and relationships between the objects, everything is OK at the architecture level. The applications can already be written on top of the interfaces.

Moving problems forward?

Having the interfaces first means that we can have any implementation. But we will need to write the implementations, right? And those implementations can have all sort of problems like they had before. So, aren't we simply moving the problems forward?

The answer to this question is something between yes and no. If we blindly implement the interfaces we can surely have an implementation with all the problems we had in the other scenarios, and to fix them we may need to completely rewrite it. Yet such a "complete rewrite" of the implementation doesn't require a change to the applications.

In fact, it is even possible to request that another team writes the implementation of these interfaces and they aren't required to have the applications at all as long as they have the interfaces, so it is guaranteed that the rework is going to be smaller.

Stateful and Stateless - The False Assumption

"We can replace implementations but we can't use a stateless web-service. These interfaces are stateful. Everybody knows that stateful services don't scale well."

This is probably the killer argument to avoid starting things with the more Object Oriented interfaces. And this is actually a false argument.

It is true that if we use the default .NET remoting, the products and categories referenced by a client need to be kept alive on the server for as long as the client can use them, or else new requests to use those objects will fail. Worse than that, it is the same computer that must answer the requests, as all the "object ids" known by the client are the server's in-memory object ids.

If we use WCF we simply can't expose these interfaces directly as services because they aren't stateless.

Yet, these are framework-specific limitations. Instead of using in-memory IDs the framework could very well be sending the actual database IDs or even the paths as the information, being capable of reloading the objects if needed and also allowing different servers to answer new requests.

Want a proof of this?

It is possible to create a stateless web-service and implement it to call these "stateful interfaces". It is possible to create an implementation of the object oriented (stateful) interfaces that stores the paths and uses that stateless service for the calls.

That is, the application can be using the real implementation directly or it can be using objects that hold paths and redirect the requests to a stateless web-service, which in turn is implemented to use the stateful objects to do the job.
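
Done by hand, that round trip could look like the following sketch. It uses a reduced, generic version of the interfaces (products are left out), a hypothetical IStatelessService contract and an in-memory "real" implementation; every class name is an assumption, and it also assumes category names never contain "/".

```csharp
using System.Collections.Generic;
using System.Linq;

// Reduced version of the stateful interfaces (products omitted for brevity).
public interface ICategory
{
    ICategory Parent { get; }
    string Name { get; }
    IEnumerable<ICategory> SubCategories { get; }
}
public interface IBaseCategories { IEnumerable<ICategory> Categories { get; } }

// Hypothetical stateless contract: every call carries the full path.
public interface IStatelessService { string[] GetCategories(string path); }

// The "real" stateful implementation that lives on the server.
public sealed class InMemoryCategory : ICategory
{
    private readonly List<ICategory> _children = new List<ICategory>();
    public InMemoryCategory(ICategory parent, string name) { Parent = parent; Name = name; }
    public ICategory Parent { get; }
    public string Name { get; }
    public IEnumerable<ICategory> SubCategories => _children;
    public InMemoryCategory AddChild(string name)
    {
        var child = new InMemoryCategory(this, name);
        _children.Add(child);
        return child;
    }
}
public sealed class InMemoryBaseCategories : IBaseCategories
{
    private readonly List<ICategory> _roots = new List<ICategory>();
    public IEnumerable<ICategory> Categories => _roots;
    public InMemoryCategory AddRoot(string name)
    {
        var root = new InMemoryCategory(null, name);
        _roots.Add(root);
        return root;
    }
}

// Server side: the stateless service is implemented on top of the stateful objects.
public sealed class StatelessServiceOverObjects : IStatelessService
{
    private readonly IBaseCategories _root;
    public StatelessServiceOverObjects(IBaseCategories root) { _root = root; }

    public string[] GetCategories(string path)
    {
        IEnumerable<ICategory> current = _root.Categories;
        if (!string.IsNullOrEmpty(path))
        {
            foreach (string segment in path.Split('/'))
            {
                ICategory match = current.FirstOrDefault(c => c.Name == segment);
                if (match == null) return new string[0];
                current = match.SubCategories;
            }
        }
        return current.Select(c => c.Name).ToArray();
    }
}

// Client side: objects that only hold a path and forward calls to the service.
public sealed class PathCategory : ICategory
{
    private readonly IStatelessService _service;
    private readonly string _path;
    public PathCategory(IStatelessService service, ICategory parent, string path, string name)
    {
        _service = service; Parent = parent; _path = path; Name = name;
    }
    public ICategory Parent { get; }
    public string Name { get; }
    public IEnumerable<ICategory> SubCategories =>
        _service.GetCategories(_path)
            .Select(n => (ICategory)new PathCategory(_service, this, _path + "/" + n, n));
}
public sealed class PathBaseCategories : IBaseCategories
{
    private readonly IStatelessService _service;
    public PathBaseCategories(IStatelessService service) { _service = service; }
    public IEnumerable<ICategory> Categories =>
        _service.GetCategories(null)
            .Select(n => (ICategory)new PathCategory(_service, null, n, n));
}
```

In this sketch the service object is called directly; in a real deployment the IStatelessService call would cross the network. The application code that walks ICategory objects is the same whether it receives InMemoryBaseCategories or PathBaseCategories, and no object has to be kept alive on the server between calls.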

"OK, it works but it is a lot more work to make it run properly and all the extra work makes it a bad architecture."

Yes, it is a lot of work... if it is done by hand. A better framework will do that transparently or, even better, will simply work differently and avoid the problems altogether.

My purpose here is not to say that we should avoid stateless services or that we must always lose time creating our own frameworks. My purpose is to show the difference between applications written without any architecture in mind, applications written with an architecture in mind, but having the architecture shaped by specific technology/framework limitations and applications that have the architecture made before considering technology limitations.

Risks

An architecture unbound from technology limitations is not always a good thing. Some limitations can't be avoided and others would require so much effort to overcome that it is better to accept them. I can say that at this moment the biggest limitation I see when designing any interface is the sync/async dilemma. An interface shouldn't expose implementation details but being synchronous or asynchronous is an implementation detail that affects the signature of the methods.

To me, that's the kind of technology limitation that we must consider when doing the architecture. If we consider only the most versatile design, without considering performance, it is probably better to make all interface signatures asynchronous as it will support all the cases. It is always possible to give a synchronous result when implementing a method with an asynchronous signature. The opposite is not always possible.

Yet, making absolutely all interfaces asynchronous is a performance killer. So, it becomes a matter of choice and the expected uses/implementations of the interfaces and, in some cases, it is even valid to have both synchronous and asynchronous methods that do the same. I hope this kind of problem disappears in the future.
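
As a small sketch of that asymmetry, consider a hypothetical IProductDescriptions interface with an asynchronous signature (the interface and both implementations are assumptions). An implementation that already has the data can answer synchronously through Task.FromResult, while one that really waits uses async/await.

```csharp
using System.Threading.Tasks;

// Hypothetical interface with an asynchronous signature.
public interface IProductDescriptions
{
    Task<string> GetDescriptionAsync(string productName);
}

// The data is already in memory, so the result is produced synchronously
// and simply wrapped in an already completed task.
public sealed class InMemoryDescriptions : IProductDescriptions
{
    public Task<string> GetDescriptionAsync(string productName)
    {
        return Task.FromResult("Description of " + productName);
    }
}

// Simulates a slow source (a database or a service call) with a delay.
public sealed class SlowDescriptions : IProductDescriptions
{
    public async Task<string> GetDescriptionAsync(string productName)
    {
        await Task.Delay(10);
        return "Description of " + productName;
    }
}
```

Both classes satisfy the same interface. Going the other way, exposing the slow implementation through a synchronous signature, would force the caller to block while waiting.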

MVVM, MVC, ORMs and everything else

Up to this moment I didn't solve many problems. In all scenarios there's code directly on the form's code behind and I clearly left all the implementations that access the database susceptible to SQL injection. So, I am doing a terrible architecture, don't you think?

To be honest, I left all those details untouched on purpose. At this moment we have three main areas: UI, the job to be done and the abstraction that let the other two talk. In some sense we can say that this is a kind of MVVM or MVC, but it is not exactly the same. In MVC and MVVM, all the layers are implementation layers. There's no real abstraction.

In the latest solution I consider the UI and the job to be done as the implementation and the abstraction (the interfaces) as the real architecture. That is, it doesn't matter if you use MVVM correctly. If your Model is bound to SQL Server and nothing else, it would be a problem to make things work through a web-service or similar. If you have the right abstraction first, then it is pretty easy to do that change.

Yet, it doesn't mean that you should avoid MVVM or using an ORM. In the end we always need a working application (or two, as the web application is not the native application), and having a good architecture is only the start. When going to implement things, if an ORM will help the team write easier to read queries and avoid SQL Injection, they should go for it. If the code behind is a problem because the designers don't know what to do with it, then go for MVVM. Only remember that those are part of the implementation. Maybe you can consider them sub-architectures as they will greatly influence the code that's going to be written, but the main architecture is built on the purpose of the application. MVVM, MVC and ORMs exist independently of the applications and should not be considered the architecture on their own.

Ref From

http://www.codeproject.com/Articles/889468/Software-Architecture-Examples

Software Architecture Introduction

I work as a Software Architect/Systems Architect and many times when I do job interviews it seems that people simply have no clue about what I do. I can't blame them, as such terms have many meanings and, if you look at the Wikipedia articles on Software Architect and Systems Architect, you will find that some of them seem to describe completely different tasks.

In fact, the entire problem lies in the fact that almost any decision made before actually writing some code may be seen as architecture. If I decide to create a game, deciding which kind of game will be created is already a decision of architecture. It is not software architecture by itself, yet such an initial decision will affect the programming that will be done later, as different kinds of game require different kinds of decisions.

But we usually start to talk about software architecture when we start to choose the technologies to be used. That is, will we use XNA? Will we use C#? Will we use JavaScript?

If we decide it will be C# but not XNA, will it be Windows Forms? WPF? Silverlight?

And, if it is a multi-player game, will we use pure TCP/IP (or UDP) writing all the communication layers/details or will we use a high-level framework like WCF?

This is a very important decision time, as the entire evolution of the application may go better or worse by those initial decisions. Yet, except in the situation that we decide to write the entire communication on our own we are in a moment to "choose" from existing technologies, not to think about how to create them.

And the worst truth is: Usually, regardless of which choices we make at this point, the application can still be developed. As I just said, the entire evolution of the application may go better or worse, but it will "be possible".

And apparently that's what most architects do: They choose technologies to write new applications. And something that makes me sad is that they usually don't think about the problem at all, they simply use extremely basic conditions as the parameter to their decisions, like:

· If it is a game, use XNA as it is optimized for games;

· If it is a local application, use WPF (if they love new technologies) / use Windows Forms (if they prefer old technologies);

· If the applications need to communicate with each other, use WCF.

And after those decisions (that is, after the initial "architecture"), they keep working, having to find workarounds for the usually bad decisions (or the lack of decisions) they made at the initial stage. After all, if the initial decisions were all right and they aren't going to develop their own framework, why would they continue to work on the project?

Frameworks

I was just saying that the initial stage usually consists of choosing technologies, like WCF, WPF and the like. For web sites that will be something like ASP.NET MVC, Web Forms, caching technologies and the like. All of those technologies can be seen as "frameworks" to do one kind of job.

Well, as an architect I usually have the job of creating frameworks like these. My purpose is rarely to choose the best existing framework, but to make the right decisions to create frameworks like these that work correctly (with good performance, low memory consumption, ease of use and, most important of all, real expandability).

But I think that you may be scared already: If I want to create a game, will I lose time creating all the technology? That's crazy!

I can agree that for a small project it may seem crazy to write an entire technology when there are others already available. But, first, that's my specialty. Maybe it is not what a company is looking for. Second, in many large projects, creating the technology, even if it starts by simply redirecting to an existing one, opens new possibilities. In fact, I started to create frameworks because most of the time I simply considered the architecture of the existing ones terrible. It doesn't mean they don't work. It simply means they weren't really helping or making things easier and, in many situations, they were limiting what could be achieved.

But before explaining the problems or the solutions, I will try to explain my view on what is a "framework".

What is a framework?

I frequently see the definition that "you call a library, a framework calls you". It is OK in the sense that when you use a framework you must "obey" its rules, usually by handling events or implementing virtual methods that will be called by the framework. But it is very problematic in the sense that some classes may be used directly (like a "library") or inherited (so the virtual methods will be called, like a "framework").

Also, any DLL is a library (that's the meaning of the last L), which can contain one or more "frameworks".

So, I prefer to say that there are frameworks in the general sense and in the specific sense. That is, a DLL created to contain a framework "is a framework", but in fact such a library can contain isolated classes usable by any application, the main framework and even "secondary frameworks".

That is, any solution to a kind of problem, be it built from a single very useful class or from a collection of many classes, may be considered a framework. A framework usually has many classes, but in your initial use you may only use the basic methods provided by a single class and only later use the extra functionality.

For example: When you use the BinaryFormatter class you are using the basic serialization provided by .NET. But you can create your own serializable classes by using the [Serializable] attribute and even by implementing the ISerializable interface. So, there is an entire framework, but in your initial case you may be using it as a simple "library" class.
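
To make the "library or framework" distinction concrete, here is a minimal sketch. It is written in Python only for brevity, and the CsvWriter and QuotedCsvWriter names are hypothetical, not taken from any real library. The same class is first called directly and then extended, so that its own code calls the overridden method:

```python
class CsvWriter:
    """Hypothetical class usable as a "library" or extended like a "framework"."""

    def format_value(self, value):
        # Hook: write_row() calls this method, so a subclass can change it.
        return str(value)

    def write_row(self, values):
        return ",".join(self.format_value(v) for v in values)


class QuotedCsvWriter(CsvWriter):
    # "Framework" usage: we override the hook and the base class calls us.
    def format_value(self, value):
        return '"' + str(value).replace('"', '""') + '"'


# "Library" usage: we simply call the class.
plain = CsvWriter().write_row([1, "a"])

# "Framework" usage: write_row() is unchanged but now calls our override.
quoted = QuotedCsvWriter().write_row([1, "a"])
```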

A common error: Creating your own framework is bad

The normal arguments I see against frameworks are:

· A framework forces your application to work in a specific direction, preventing you from doing anything different;

· The code of a framework to solve a problem is harder than solving the problem directly so, in most cases, creating a framework will only add complexity to the project;

· Using the previous definitions, some people say: "Create a library, not a framework";

· Your framework will never be as feature complete as a framework made by a company dedicated to doing that;

· If you quit the company, who will maintain the framework? By buying it from a company we have the guarantee that we will have support.

And I must say that I mostly agree with all the arguments. But the truth is: any big project ends up having a framework, be it a well-architected one or a messed-up one made on top of other frameworks (and that's what some developers that hate frameworks usually do).

That is, developers that avoid creating a framework and buy an external one usually end up with their own framework, based on the external one, and it usually has the original limitations plus the limitations they may have added to it.

The entire idea is that by using software made by a company we have better support, better quality etc. But a company dedicated to creating a technology doesn't know our specific needs, so it will give us some "generic" solution. Unskilled programmers may try to do the same and they may end up doing a very poor job. Very experienced developers may make a better solution for the company, even if it is not as feature complete as the one bought from another company.

So, if you are a really experienced developer (or if you have really experienced developers working for you), it may be worth letting them create a framework specific to the company's needs.

Architecture - The Bad Ones

When I think about a project, I usually start by thinking about what I want to do, then I think about the things needed to do the job (the concept of a technology, not the technology itself) and only later I think about the existing technologies that may help me in doing that.

But because I already thought about the technologies that might be needed without thinking about a specific one, I didn't think about any limitations, any technology-specific data or any workarounds. Then, when I look at the existing technologies, what usually makes me decide to create a framework of my own is that those technologies expect the application to be written "to use them". Even if using one such technology is acceptable, I can't put two external technologies (frameworks) to work together, because each one requires the code to be written for it and neither knows about the existence of the other.

What do I mean by "they expect the application to use them"?

Well, they expect that your code is written to:

· Inherit from their base classes;

· Implement their interfaces;

· Use their attributes;

· Or anything like this, which requires the code to be compiled with a reference to them.

And so, if you use objects of Framework A (which doesn't know about Framework B) you can't use those objects with Framework B unless you create adapters.

Creating adapters works but, in some cases, it is a waste of time. When we use a framework like Serialization we want to "convert object instances to bytes" without caring how to do it. But if we need to create an adapter that's serializable, why not write the serialization by hand? And worse, you may have a really big graph of objects, and only one of the objects may not be marked as [Serializable], even if it is extremely easy to make it serializable. But you still need to recreate the entire "adapted" graph to solve such a problem, as you can't change the source code of an external framework.

And, if you think you can create something to automate the entire adapted graph, you will be creating a "framework" to create adapters. So, why not create the right framework directly?

Note: I already talked about how the attributes violate the Single Responsibility Principle in the article Attributes vs. Single Responsibility Principle. There, some people argued that [Attribute]s aren't code, that they are attached to classes/properties and not part of them. Yet, consider the problem of third-party libraries. You can't change their source code to add the attributes and you can't add attributes at run-time (well, at least not until .NET 4... I am not sure if that's possible in .NET 4.5).

Also, there are other kinds of problems. Usually they aren't as bad, but I consider them to be very annoying. This happens with frameworks that expect to find some configuration directly in the configuration file, without giving you a chance to set such a configuration from code, or with frameworks that do some kind of action automatically but don't allow you to extend such an action, only to replace it (and worse, that usually must be done instance by instance when a global extensibility point would be better).

So, which frameworks do I consider problematic? Most of them, even if they are used worldwide.

This includes:

· WPF "convert" bindings;

· The TypeConverters in general;

· Default binary serialization of .NET;

· Default XML serialization of .NET;

· WCF attribute-driven architecture;

· MarshalByRefObject and all the classes that already inherit from or depend on it;

· Most ORM frameworks, which usually are attribute based, configuration file based and constrained to database types, requiring adapters to be created if we want the data to be presented with application-specific data-types.

I am not saying that those frameworks don't work. They work. But they could be better.

So, explaining each point:

· If you don't specify a Converter in a WPF binding, it is able to do some automatic conversion. I think it is able to use the [TypeConverter]s, which are already limited. But you can't register a converter from type A to B to be global to your application if those types are from unrelated assemblies;

· The [TypeConverter]s in theory can convert any type to any type, but they require the attribute to be used in one of the two types (be it the source or the destination) and a single type-converter must know all the possible conversions. So, considering that we may have types that are easy to convert from one to the other but come from unrelated libraries, we are stuck. So those type-converters end up used only to convert to and from strings or some of the primitive types;

· The .NET Binary serialization can't serialize a type that's not marked as [Serializable]. It is not important if you know how to serialize it. This is even worse if only the deeper level of a big graph is non-serializable;

· The .NET XML serialization doesn't share the binary serialization attributes, so if you create a class that can be serialized by both you need to remember to use the attributes for both;

· With WCF, you can't take a component that's already made to work as a service (for example, stateless and using only basic data-types) and expose it if it doesn't use all the "contracts" expected by WCF. I will explain a little more on this later;

· The entire idea of the MarshalByRefObject is that all the calls become virtual, even if you mark the class as sealed, so they can be "replaced". Well, interfaces are purely virtual, but with an interface you have the option to use the calls as purely virtual or to continue using the exact class type, avoiding any virtual call. With the MarshalByRefObject you always end up doing virtual calls, even when you don't want to. So, tell me: how many times did you open a file (FileStream) and really expect it to be replaced by another class? Why not use a Stream (or better, if it existed, an IStream) when you want any stream, and use non-virtual calls to the FileStream when you know its exact type? Unfortunately, because it is a MarshalByRefObject, you can declare the variable with the real sealed type and the virtual calls will still be made.
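
As an illustration of the type-converter complaint above, this is a minimal sketch of what a global, application-level converter registry could look like. It is Python rather than .NET, and every name in it (register_converter, convert, LibraryAPoint, LibraryBVector) is hypothetical. The point is only that neither type needs to know about the other or carry any attribute:

```python
_converters = {}  # (source type, target type) -> conversion function


def register_converter(source, target, func):
    """Register a conversion globally, without touching either type."""
    _converters[(source, target)] = func


def convert(value, target):
    if isinstance(value, target):
        return value
    func = _converters.get((type(value), target))
    if func is None:
        raise TypeError(
            f"No converter from {type(value).__name__} to {target.__name__}"
        )
    return func(value)


# Two types standing in for classes that come from unrelated libraries.
class LibraryAPoint:
    def __init__(self, x, y):
        self.x, self.y = x, y


class LibraryBVector:
    def __init__(self, dx, dy):
        self.dx, self.dy = dx, dy


# The application, which knows both libraries, registers the conversion once.
register_converter(LibraryAPoint, LibraryBVector, lambda p: LibraryBVector(p.x, p.y))
vector = convert(LibraryAPoint(3, 4), LibraryBVector)
```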

IoC - Inversion of Control

Maybe I am getting a little off-topic here, but another thing that annoys me is the now popular idea of Inversion of Control. In fact, I already consider it a bad name. If the correct architecture is to "invert control" and people respect it, then it becomes the "normal", not the "inverted" architecture.

To allow such an inversion of control it is recommended that you only depend on "interfaces" not on "implementations", but such a solution is not the best solution all the time. Some components may expect to work only with their "family" components, not with any other component. So, if you use IoC with them, you must use IoC for the entire family, effectively being able to replace one family by another one, not to replace individual components.

That's the case with ADO.NET connections, commands, parameters and the like. You can replace the entire SQL Server family by the entire Oracle family, but you can't replace only the connection without replacing the other components. So, if you are not writing the application that uses components like that, but writing the components themselves (I mean, any component family or framework), you don't really need to make one component talk to the others only through interfaces. Having the interfaces is good to avoid the need for adapters if you want to replace the entire "family", but the components can talk to each other knowing their concrete types.

In fact, the best architecture in such a case is to have the interfaces declared in a common assembly (DLL) and to implement the specific "families" in other DLLs. The users will then be able to depend only on the common assembly and thanks to an IoC container choose which "family" to use at run-time. But each family can be written depending directly on their family components, avoiding the interfaces, the virtual calls, having access to internal fields, properties and methods and also avoiding the IoC completely.

So, if you think that you should make every class only talk to other classes by interfaces, well, think again.
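
A minimal sketch of that layout, in Python and with hypothetical names (Connection, Command, MemoryConnection, MemoryCommand): the abstract classes play the role of the common assembly, while the members of one family use each other's concrete types and internal fields directly.

```python
from abc import ABC, abstractmethod


# "Common assembly": the only thing application code depends on.
class Connection(ABC):
    @abstractmethod
    def create_command(self, text): ...


class Command(ABC):
    @abstractmethod
    def execute(self): ...


# One "family". Its members know each other's concrete types.
class MemoryConnection(Connection):
    def __init__(self):
        self._log = []  # internal detail, used directly by MemoryCommand

    def create_command(self, text):
        return MemoryCommand(self, text)


class MemoryCommand(Command):
    def __init__(self, connection: MemoryConnection, text):
        self._connection = connection
        self._text = text

    def execute(self):
        # Direct access to a family-internal field, no interface involved.
        self._connection._log.append(self._text)
        return len(self._connection._log)


def run(connection: Connection, text):
    # Application code: it sees only the interfaces.
    return connection.create_command(text).execute()


connection = MemoryConnection()
first = run(connection, "SELECT 1")
second = run(connection, "SELECT 2")
```

Replacing the whole family means passing a different Connection to run(); replacing only the command would make no sense, which is the point of the paragraph above.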

A Note About "Component Families"

I just talked about ADO.NET to explain the component families and a thing I see very frequently is a solution that "loads" drivers using a "rigid" rule.

That is, ADO.NET uses entries in the configuration file (and the Machine.config) to search for database drivers by name if you use the DbProviderFactories.GetFactory() method. That's an extremely rigid rule. The application can't tell it how to search for the drivers differently.

Even if you can load the drivers without using the DbProviderFactories, remember about such a problem if you create your own "basic solution" capable of loading drivers. You may look for drivers locally or by using some rigid rule like that but, if one isn't found, allow an event to do the search. The AppDomain.AssemblyResolve is an example of how you can create an event to solve that "missing information" problem and it already allows some clever usages, like embedding the libraries into the application while allowing them to be found only when requested.

Architecture - First "Fix"

I know that most of us simply can't solve the architecture problems of already existing frameworks. But, if you work on the creation of some framework, there's an "easy fix" for most of the problems, and it is very similar to the AssemblyResolve event: call an event to try to do the job before failing.

If we look at what's happening in most cases, it is like this: a framework wants some more information to finish its job and, to find such information, it may:

· Read a configuration file;

· Read an attribute;

· Cast your instance to an interface.

And, if it isn't able to do that, it simply fails/throws an exception.

So, why not call an event at that moment, giving all the information you already have (that is, the instance you are working on, the action you want and the parameters you already have, like a conversion from a value X to a specific type) and let the event tell you if it was able to do the job or not?

Only in a situation where the event doesn't do the job you generate the error/exception.

This will solve the .NET binary serialization, the XML serialization, may enable WCF to use types that don't have the right attributes and all of that. And, the best of all: As it is not a change to existing methods, but a new event, it will not cause a breaking change as old code will simply ignore the existence of such an event.
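
A minimal sketch of "call an event before failing", in Python and with hypothetical names (Serializer, UnknownTypeArgs, unknown_type). Here Fraction, from Python's standard library, stands in for a third-party type that we cannot modify and that the serializer doesn't natively support:

```python
from fractions import Fraction


class UnknownTypeArgs:
    """Event arguments: everything the serializer already knows."""

    def __init__(self, value):
        self.value = value
        self.result = None
        self.handled = False


class Serializer:
    def __init__(self):
        self.unknown_type = []  # the "event": a list of handlers

    def serialize(self, value):
        if isinstance(value, (int, str)):  # what the serializer knows natively
            return repr(value).encode("utf-8")
        args = UnknownTypeArgs(value)
        for handler in self.unknown_type:
            handler(args)
            if args.handled:
                return args.result
        # Only fail if no handler was able to do the job.
        raise TypeError(f"Cannot serialize {type(value).__name__}")


def serialize_fraction(args):
    if isinstance(args.value, Fraction):
        text = f"{args.value.numerator}/{args.value.denominator}"
        args.result = text.encode("utf-8")
        args.handled = True


serializer = Serializer()
serializer.unknown_type.append(serialize_fraction)
data = serializer.serialize(Fraction(1, 2))
```

Code that never registers a handler sees exactly the old behavior, which is why adding such an event is not a breaking change.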

Only to finish comparing the fix to the previously presented problems: the MarshalByRefObject is a different kind of problem, which can be solved by using interfaces. And about the ORMs, well, there are many ORMs with different kinds of problems, and some of them will benefit from such an event call.

Improvement to the first fix

Note that the first fix is subjective already. For example, the event could be used to know if a type is [Serializable] or not, even if it doesn't have the attribute. This will solve the problem for types that have a serializable structure but not the attribute (and can even be considered a source of bugs if used incorrectly), but it will not help with types that don't have a valid structure but could be serialized by a user-made algorithm.

So, calling the event asking to serialize a type that's not serializable (instead of trying to consider it [Serializable]) would be much more appropriate. Yet MulticastDelegates aren't optimized to have a single answer. That is, there could be more than 30 (or even 300) event handlers attached, each one dedicated to a single type. Should we execute all the handlers all the time?

That means we may require another solution (well, at least if we want an optimal solution, as by simply having the event it is already possible to build a better solution on top of it). My solution for the serialization problem is to try to find a serializer for such a type, and then register that serializer in a dictionary. That is, I don't ask to serialize a given instance, I ask if there's a serializer for such a type and, if there is, I know that I can serialize other instances of the same type without having to call the event again (yes, I wrote my own serialization framework).
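
The dictionary-based lookup described above might be sketched as follows. This is not the author's serialization framework, just a Python illustration with hypothetical names (TypeSerializerRegistry, resolvers, resolver_calls). The event is asked for a serializer for a type, and the answer is cached so the handlers are not executed again for that type:

```python
class TypeSerializerRegistry:
    def __init__(self):
        self._serializers = {}   # type -> function(value) -> bytes
        self.resolvers = []      # the "event": function(type) -> serializer or None
        self.resolver_calls = 0  # only here to show how often the event runs

    def get_serializer(self, value_type):
        serializer = self._serializers.get(value_type)
        if serializer is None:
            for resolver in self.resolvers:
                self.resolver_calls += 1
                serializer = resolver(value_type)
                if serializer is not None:
                    # Cache it: later instances of this type skip the event.
                    self._serializers[value_type] = serializer
                    break
        if serializer is None:
            raise TypeError(f"No serializer for {value_type.__name__}")
        return serializer

    def serialize(self, value):
        return self.get_serializer(type(value))(value)


def resolve_complex(value_type):
    if value_type is complex:
        return lambda v: f"{v.real}+{v.imag}j".encode("utf-8")
    return None


registry = TypeSerializerRegistry()
registry.resolvers.append(resolve_complex)
first_bytes = registry.serialize(1 + 2j)
second_bytes = registry.serialize(3 + 4j)  # served from the dictionary
```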

Well, for the entire concept I wrote an article called Actionless Frameworks, so check that article if you are interested.

Architecture - Second "Fix"

The first fix by itself may suffer from another problem: being too local.

That is, for the serialization problem we may create a solution where the serializer has an event to serialize types it is not naturally capable of serializing. But will we add the handler per instance?

Even if you think it is appropriate (and it usually is), it is also very important to avoid repetition. In the same way that a type that has the [Serializable] attribute doesn't need to be "added" as valid per serializer instance, it is very important to have global solutions.

In fact, we can say that the first "fix" should exist as a static solution so it can work globally. Whether you add local and global solutions or only global solutions is not that important, as a global solution, if well written by the user of your code, could work correctly for local situations. The opposite, unfortunately, isn't true.
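
To show the difference between a local and a global extension point, here is a small Python sketch with hypothetical names (TextFormatter, global_handlers, local_handlers). The class-level list plays the role of a static event, while the instance-level list only affects one object:

```python
class TextFormatter:
    global_handlers = []  # class-level list: shared by every instance

    def __init__(self):
        self.local_handlers = []  # affects only this instance

    def format(self, value):
        if isinstance(value, str):
            return value
        # Local handlers get the first chance, then the global ones.
        for handler in self.local_handlers + TextFormatter.global_handlers:
            text = handler(value)
            if text is not None:
                return text
        raise TypeError(f"Cannot format {type(value).__name__}")


# Registered once, valid for every formatter in the application.
TextFormatter.global_handlers.append(
    lambda v: f"#{v}" if isinstance(v, int) else None
)

first = TextFormatter()
second = TextFormatter()

# Registered on one instance only.
second.local_handlers.append(
    lambda v: f"{v:.1f}" if isinstance(v, float) else None
)
```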

Architecture - Services

Now I will stop focusing on the fact that I like to create frameworks or on the problems of existing frameworks, as you may be the kind of person that says that you will not create a framework and you will accept the limitations of the existing ones.

So I will talk about SOA (Service Oriented Architecture). It is a common idea now that we should use SOA, as such an architecture allows every service to be created as a separate application, even using different languages if necessary, and brings many advantages like distributed processing, real isolation of failure points and many others.

The only thing that SOA really requires is communication. And, even if SOA already means architecture, every service also requires an inner architecture and, at least in .NET, the most accepted technology to allow the SOA to work is WCF.

Well, I just complained about WCF being too attribute based, but you may consider it OK as you will create a new WCF service and implement it as WCF from the start. So, the fact that it uses WCF specific attributes is not a concern at all... right?

And here is where I consider that many applications have a big lack of architecture.

A service is created to do some kind of job/solve some kind of problem. Such kind of solution may work very well as a [web] service. Yet the solution can (and I dare to say that in most cases it should) exist independently of the communication framework that's used.

One of the possible reasons is: Imagine that you decide that for a particular application you will embed the service in it, or even that you will use a completely different service technology. Wouldn't it be much better if such a "service" were a simple "library", without any WCF-specific data?

So, the WCF part could be completely stripped away without problems. That is, the basic architecture may be: "Create any service as a library". Then, if you want to make it accessible as a real web-service, you create another application that's bound to the service and only fills the information needed to expose the library as a service.
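
As a rough sketch of "create any service as a library", consider the Python below. All names (PriceService, handle_request) are hypothetical, and the JSON message format is an assumption made only for this example. The service class contains no communication-specific code, and a separate, thin host function exposes it:

```python
import json


class PriceService:
    """Plain library class: no communication-specific code or metadata."""

    def total(self, unit_price, quantity):
        return unit_price * quantity


def handle_request(service, request_json):
    """Hypothetical host-side entry point: maps a JSON message to a call.

    A real host would also validate which methods may be called.
    """
    request = json.loads(request_json)
    method = getattr(service, request["method"])
    result = method(**request["args"])
    return json.dumps({"result": result})


# Embedded use: the application calls the library directly.
local_total = PriceService().total(2.5, 4)

# Exposed use: the same, unchanged class behind a message-based entry point.
response = handle_request(
    PriceService(),
    '{"method": "total", "args": {"unit_price": 2.5, "quantity": 4}}',
)
```

Swapping the host for another communication technology would not require touching PriceService at all.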

And that's my problem with WCF. While it could be possible to transform a normal library that's already stateless into a service by simply "registering" the type as a service (the old and almost obsolete .NET remoting supports that) in WCF we should have a class full of attributes which, in a situation like the one I am describing, means it is necessary to have an "adapter" class per service class, only to add the needed attributes and redirect to the original, attribute-free, library.

But the worst problem I see is that many people will simply write all the code inside the service directly and, if needed, will import the service with a lot of unnecessary attributes to "embed" a service into an application.

Note: I am not discussing the fact that WCF can use different transfer protocols and all of that. I myself created a framework that allows local communications over Memory Mapped Files that's almost 30 times faster than the best WCF configuration I found for local communication. To me, WCF is very optimized for remote communications and does a great job there, but it is far from ideal for local communication, regardless of its support for binary communication and pipes.

Architecture - Program to interfaces - Real situations

A common expression that I usually hear and see is that we should "write to interfaces, not to implementations". This is usually justified by things like IoC, testing and a lot of "amazing" things. Yet, as I explained in the IoC topic, simply making every component talk to others by interfaces is bad. Families of components expect to work with their relatives.

Yet a situation that I see very frequently is people trying to do globalization by using resources directly. That is, the code is dependent on the resources API and is not capable of working with non-resource solutions.

I can go one step further and say that globalization is a kind of feature for which we should consider the use of a "framework" or a "service" (or even both, depending on some kind of configuration).

So, how can we achieve such a support for both? I just answered that. Interfaces.

Having all "services" seen as interfaces locally allows those services to be implemented differently without breaking your code. That is, a basic application may implement the service to respond that it doesn't find any translation (and I am already considering the program uses some language, like English, by default), a little better implementation may use a text file to find translations, some other implementations may use specific resource files and some others may redirect to an external service or even find those translations using a database.
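
A minimal Python sketch of such a translation "service" seen as an interface. The names (Translator, NoTranslator, DictionaryTranslator, greeting) are hypothetical. The calling code depends only on the interface, so the implementation can go from "no translation" to a table, a file, a database or a remote service without changes:

```python
from abc import ABC, abstractmethod


class Translator(ABC):
    @abstractmethod
    def translate(self, text, language): ...


class NoTranslator(Translator):
    # Basic implementation: never finds a translation, returns the default text.
    def translate(self, text, language):
        return text


class DictionaryTranslator(Translator):
    # Slightly better: looks translations up in an in-memory table.
    def __init__(self, table):
        self._table = table  # {(text, language): translation}

    def translate(self, text, language):
        return self._table.get((text, language), text)


def greeting(translator: Translator, language):
    # Application code: it doesn't know where translations come from.
    return translator.translate("Hello", language)


untranslated = greeting(NoTranslator(), "pt")
translated = greeting(DictionaryTranslator({("Hello", "pt"): "Olá"}), "pt")
```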

So, one of the good things that programming to SOA does is that references to other services are usually already made through interfaces. So, if you have an interface, you can change the actual implementation without problems.

Architecture - Application

When talking about SOA I said that one of the advantages is that services usually are presented as interfaces, so the code is already prepared to be "replaced" by another implementation.

But that's a half truth. Surely by using interfaces we can replace one instance by another one. But how are we getting our instances?

A common architecture problem of SOA consumers is that they call the service library directly to create the service instances and so, even with interfaces that allow the implementation to be replaced, they are completely bound to the technology that implements those interfaces. The code simply can't replace one implementation by another one, as the "start point" is already the service library (be it WCF or another one).

So, following the same principle that we should make our service as a library and only later, if needed, create the service (as a separate program that uses such a library), we should program the application in a manner that it doesn't directly see the communication layer/technology. That is, when you program your application, it shouldn't ask WCF for an instance of the service IMyService (so, your application should not see the ChannelFactory, the ClientBase or the System.ServiceModel.dll directly).

That is, you can use an IoC container or you can create your own class that will work as your "factory", which can use an event to create the implementations of the interfaces (services) you ask for. Then your code should only use such an IoC container or factory as the starting point. With this extra "layer" you will be able to replace the creation of a service from a specific library with a "generic" one, and so you will be able to replace the implementation at any moment (including using a local service instead of a remote one) without breaking all the places that instantiate the service.

Façades

Now that I have presented the case that, to call a service, you should not ask the service library directly to create the service instances, I will talk about something that's a little counter-intuitive.

I was just saying that we should use interfaces so the code can be easily replaced. But, for many situations, it is better to give some sealed solutions. Especially when talking about web services, as it is a common practice to pass all the needed parameters per call.

Compare this with normal objects that are created, their properties are filled and only later one or more calls are done, without any parameters or with a very reduced list of parameters.

So, to achieve this, we should use façades. We should create local objects that have a "local approach" to using the services, even if they internally redirect to one of those interfaces that have many parameters (for some of which you may want to use default values).

As I said, this may seem counter-intuitive as I was just saying to program to interfaces, to avoid adapters and all, and that will be an "adapter" that uses sealed or even static classes. But that doesn't mean that you will be bound to an implementation, as such façade will still use your IoC container or configurable factory. This will only mean that the users will not see the interfaces and the factory all the time. The developers creating the service and configuring the factory will see that, but the developers that will only use the service will simply see local classes that do the work correctly, without having to bother about condensed method calls and interfaces.

Conclusion

I hope that after reading this article you can see that home-made (or company-made) frameworks aren't that bad, and that frameworks known worldwide aren't necessarily better prepared to help your application evolve than a framework that you can write. I also hope you can see that home-made frameworks can benefit from existing frameworks while keeping the capacity to completely replace an old external framework with a new one without changing the application itself, only requiring some adapters to be written if such frameworks aren't already prepared to adapt to your code.

And the most important conclusion of all is that, if you write a framework, you should allow it to be used in applications that already reference other libraries that aren't going to change. So, allow any information that's required by your code to be found in different ways, by creating an event that can supply such information if it wasn't already given to your framework by other means.

Reference From

http://www.codeproject.com/Articles/680661/Software-Architecture

By Paulo Zemek
