Java and C#
I want to start with a quick comparison between languages on the JVM and .NET. The main differentiators that I see between the JVM and .NET are twofold: -

The JVM is much more common on non-Windows servers e.g. LAMP stacks. Yes, there's Mono, which is becoming more popular, but generally this domain is taken by Java.

Between the two most popular languages across the two platforms – Java and C# – Java has slowly stagnated over the years, caused apparently by over-deliberating by committee on the details of various features, whilst C# has gone steaming ahead and introduced genuinely revolutionary features to OO developers, such as queries and asynchronous programming as first-class citizens, as well as functional programming constructs.
Whilst Java (and indeed the JVM) has a fundamentally limited implementation of generics, which will likely never change, C# and .NET have generics baked throughout the runtime. And whilst Java 8 will (in 2014) finally get lambda expressions – something I remember Java developers passionately castigating in 2008 when C#3 first came out – because Java doesn't have a notion of a delegate, the implementation won't be as elegant as C#'s, involving (as I understand it) compiler tricks to turn lambdas into single-method interfaces.
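To show what first-class delegates buy C# here, a minimal sketch (using the framework's standard Func delegate types): -

```csharp
using System;

class DelegateDemo
{
    static void Main()
    {
        // A lambda converts directly to a delegate type...
        Func<int, int> square = x => x * x;

        // ...and so does a method group – no wrapper interface required.
        Func<string, int> parse = int.Parse;

        Console.WriteLine(square(5));   // 25
        Console.WriteLine(parse("42")); // 42
    }
}
```

In Java 8, by contrast, the lambda has to be matched by the compiler against a single-method interface type rather than a true delegate.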
One thing both Java and C# do have in common is that they are rapidly becoming languages whose natural home is the server side – leaving aside e.g. internal LOB applications or native mobile applications (even that's a bit of a hot potato with things like Xamarin – maybe best left for a discussion on another day).
The good, the bad and the ugly of C#
So C# has given us developers some pretty good things. Yet in a sense, since C#3 came out, the language hasn't moved forward that much. Dynamic was introduced, we got better co- and contra-variance, and C#5 introduced async/await. All excellent features (particularly the last one), but where do we go from here? I'll go back to my comment about lambda expressions in Java again here. Some of the main criticisms of LINQ from the Java community when it first came out were driven by emotion rather than reason: that LINQ was just a poor man's SQL embedded into the language; that C# was now a dynamic language; that extension methods would somehow break encapsulation; that LINQ was just Microsoft's latest attempt to win new customers and it "wouldn't work", etc. All nonsense, yet one underlying message from the complaints was that lambdas and LINQ would be too confusing for developers. To an extent I think this is more of an issue in the Java community, where I see the language being used in a far more strictly OO fashion than C# is nowadays.
But, at the same time, think about a new developer learning C#. There are already many ways to achieve the same thing at both the language and framework level, e.g. set-based operations, asynchronous operations, multi-threading etc. Think how many constructs and fundamentals there are to learn these days; you'd probably get contradictory information and advice from ten experienced .NET developers who have picked up different bits over the years. At least we've been drip-fed C# over the years at a rate we can absorb, whereas a new developer will have to try to distil all that information down to a manageable amount. It's harder and harder to know the "right" way of doing things.
There are some things in the language that we'll probably never get – things like non-nullable reference types, which would eliminate whole classes of bugs – and some things we'll probably never see removed, like the leftovers from .NET 1 that were superseded in .NET 2 (the non-generic collections), or the multiple ways of doing async.
Where does C# go from here? Do the designers continue to add features, e.g. immutability, more succinct type declarations (something I've been hearing might well happen), or improved type inference (something I would love to see)? With each revision of C#, it becomes harder and harder to add new features. On the other hand, languages like Scala on the JVM and F# on the .NET framework are changing at a much more rapid pace. Ironically, Scala seems to me more like C# than F# does, with its objects-first-but-functional-programming-supported approach, but both it and F# are introducing new features to their respective communities.
Is there a case for keeping the changes in C# at a relatively sedate pace now, before the language becomes even more complex, and letting e.g. F# push the boundaries with new features? What features do we need to see added to C# to improve the programming experience without making the language a mish-mash? Added to this is Roslyn – what impact will that have on the language going forward?
I suppose the main upshot is that I think the days of mega-changes in C#, like we saw in C#3, are over. What I suspect we’ll see in the future are most likely carefully-thought-out smaller features which can improve the experience or perhaps deal with specific problem domains but won’t necessarily change the fundamentals of C# as we saw with LINQ.
I've spent a lot of time over the past few years working on multi-developer projects. It's incredibly important that you ensure that the pit of success is large for other developers. Obviously there will always be a learning curve, which hopefully can be alleviated through pairing and/or decent developer documentation, whether through test suites or a wiki. But a large part of making a framework discoverable lies in choosing the namespaces correctly. I'm talking here about the dreaded ".Core" or ".Framework" areas e.g.
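For illustration, the sort of layout I mean (Company.App is a placeholder name): -

```csharp
// The framework types get buried under an extra ".Core" level...
namespace Company.App.Core.Services
{
    public abstract class ServiceBase
    {
    }
}

// ...whilst the code that consumes them lives one level up,
// forcing consumers to know about and import the Core namespace.
namespace Company.App.Services
{
    public class CustomerService : Company.App.Core.Services.ServiceBase
    {
    }
}
```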
What purpose does this extra “Core” give us? Absolutely nothing! The worst part about something like this is that it only serves to obfuscate the most common parts of your system instead of making them as easy to find as possible. And yet I see it time and time again, on one project after another.
The problem with Framework namespaces
Why is this a problem? Imagine you’re a developer working on some part of the system. Perhaps you’re coding a service that lives in a namespace like Company.App.Services e.g. Company.App.Services.CustomerService. Why should you, as a consumer of the framework, have to know about adding a using statement (or similar) to Company.App.Core.Services in order to use ServiceBase or similar? The answer is – you shouldn’t!
Intellisense should be able to present you with common types as soon as you tell it what you are working on, be it a service, or repository (which I hate – more on that in another post) or whatever else. How do you tell Intellisense what you are working on? By what namespace you’re in. Your core types should live in the same logical namespace as the most likely namespace for consumers of that type. ServiceBase should live in Company.App.Services because this is where your actual services live.
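To make that concrete, a sketch of the layout being advocated (class bodies elided): -

```csharp
namespace Company.App.Services
{
    // The framework base type lives in the same logical namespace
    // as the services that derive from it...
    public abstract class ServiceBase
    {
    }

    // ...so consumers pick it up with the using statement (and
    // Intellisense context) they already have.
    public class CustomerService : ServiceBase
    {
    }
}
```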
There's a subtle difference between physical deployment – where slowly-changing framework classes probably should live apart from the day-to-day changing business code (particularly when writing modular, pluggable code) – and logical namespaces, where many types can happily share one namespace across many physical assemblies.
There's also an oft-repeated mantra that says your assemblies should be named after the namespaces they contain. This is fine – in principle. For your framework assemblies, though, it makes no sense. I recommend that when you start writing common helper classes, core interfaces etc. in your core assemblies, you make the default namespace of those projects the highest that it can be, e.g. Company.App. Then make folders in the project for each area that your framework spans. The names of your assemblies do not need to follow the assembly-follows-namespace standard.
Framework classes should be carefully distributed across the entire namespace of your system, not bundled up into one uber-namespace or tucked underneath .Framework. Framework types relating to services should live in the same namespace that your team write their services in; types relating to UI should live in the same namespace that your team write their views in.
Another bugbear I've come across lately: please answer the following question without mentioning the terms "heap" or "stack": -
What’s the difference between .NET reference and value types?
I've asked a number of people this question lately and, without fail, all of them have answered that the former are allocated on the heap and the latter on the stack. That is indeed one difference, but it's an implementation detail: it could change tomorrow and your code would still behave in the same way. For a developer writing in a .NET language, it has a much smaller impact on how you write code than the real answer does.
So what’s the real difference between the two? It’s how they behave when passed between methods i.e. value types are copied whereas reference types refer to the same instance.
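A minimal sketch of that behavioural difference (type names hypothetical): -

```csharp
using System;

struct PointValue { public int X; }    // value type
class PointReference { public int X; } // reference type

class CopySemanticsDemo
{
    static void Mutate(PointValue v, PointReference r)
    {
        v.X = 99; // mutates a local copy – the caller never sees this
        r.X = 99; // mutates the shared instance – the caller does see this
    }

    static void Main()
    {
        var v = new PointValue();
        var r = new PointReference();
        Mutate(v, r);
        Console.WriteLine(v.X); // 0
        Console.WriteLine(r.X); // 99
    }
}
```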
Almost everyone I prompt knows this, but they don't think to say it when asked the original question.
Some random thoughts and suggestions follow regarding how you design the namespaces that your classes fit into when writing APIs that other developers will use.
Many of these points I've been dealing with myself recently as new developers have joined a team I'm working in; when documenting some of the API and framework, I realised that it was not as easily discoverable or navigable as it could be, and the next time I work on a reusable API I'll bear these points in mind. I'd also recommend having a read of Framework Design Guidelines, which I read not too long ago and which really opened my eyes in a number of ways to framework design.
Make your classes discoverable. I can’t stress this enough. The namespaces of your API should be easy to navigate, logically laid out and as readable as possible. Put yourself in the shoes of another developer who types MyApplication. in Visual Studio. When they press that “.” key, Intellisense is going to pop up and show them the different areas they can look through. This needs to be as clear as possible in order to guide them through the API; when I start trying out a new API, Intellisense is my first port of call – I certainly don’t start by reading all the documentation on every class.
Distinguish between 80/20 classes. It’s tempting to simply place all your classes that fulfil a particular function into a single namespace, but this often doesn’t help the developer. Instead, consider making a sub-namespace which contains more advanced features or classes (used 20% of the time) and keep the more common ones (used 80% of the time) at the higher, more easily discoverable, namespace. Also be strict regarding encapsulation – don’t make classes / methods / properties etc. public unless you need to.
Do not use department names in your namespace. Physical departments are always changing, and at the end of the day they usually have nothing to do with your API. If your department is restructured or renamed, will your code magically stop working? I would even go as far as to say that for internal applications you shouldn't include the company name either – what's the point?
Beware of rigidly following the "namespace must be a child of the assembly name" rule. If we had an assembly called e.g. "MyApplication.Framework", all code in that assembly should theoretically go underneath the MyApplication.Framework namespace. However, as an API developer, you might not want very frequent and popular classes to be buried away underneath the Framework level – you might want a developer to simply type "MyApplication." and see the four or five most important classes that your API has to offer, because that's all they might need most of the time. Indeed, those classes might have nothing to do with one another in terms of functionality – perhaps one is to do with logging, another with eventing. It can sometimes be better to have an assembly which does not follow the System.Subsystem.BlaBla naming convention and is instead simply called "BlaBla" – you're not implying a namespace from that name, so you are free to put its types in any namespace.
Don't be concerned with file navigation. Don't worry that the developer cannot infer at a glance what assembly a particular class lives in. They can hit F12 and be transported to the source code (or metadata) of the class immediately; they can use Solution Navigator to find the file, or have Solution Explorer follow the active file. That should not be your main concern when designing your API.
Before I start, this isn’t an “elitist” rant of mine! Now that that’s out of the way…
One thing that often gets on my nerves when interviewing developers is when people rate themselves out of 5 for a particular technology. Generally people tend to rate themselves on the following scale: -
3: I have used this technology a few times and know what it is.
4: I have used it on at least one project before.
5: I have been using it for a while and think that I know it to the fullest.
There is no 1/5 or 2/5, of course. No-one ever gives themselves 1 for a technology on their CV (after all, why bother saying you're bad at something?). And the worst part is, all this rating really shows is how much you know that you don't know about the subject matter. Let's take C# as an example. Someone might know the following: -
how to create and use generic collections, and understand the benefits from a usability point of view over ArrayList
how to use e.g. foreach loops and basic flow control
how to use LINQ queries on a LINQ-to-SQL data context
However, they may not be aware of: -
how to create their own generic classes, or how generics work underneath
what the yield keyword is and how it can be of use
what the difference is between IEnumerable and IQueryable, or what an expression tree is
That's all fine. But as far as this person is aware, they're a 5/5 C# developer, because they aren't even aware of what they're missing out on. Perhaps they give themselves a 4/5 "just in case" there's a keyword they might have missed. Whenever I interview people, I'd much rather someone give themselves a 2 or 3 out of 5 and be honest about it – or perhaps surprise us with what they know – rather than big themselves up and then fall at the first hurdle. All that does is fill me with suspicion – after all, if they've given themselves 5/5 for something and been caught out so quickly, all the other aspects of their CV should be treated with the same suspicion.
My personal advice would be that unless you can back it up, don’t do it.
For what it’s worth – and I don’t consider myself an expert in it – here’s the sort of things I would look for if someone rates themselves out of 5 in C# (note I have tried not to mention things like design patterns etc. etc., but rather just knowledge of the language itself, and I suppose its relationship with .NET as well): -
1/5: Understanding of basic control flow; try/catch statements; methods and overloading; difference between static and instance members; properties; levels of encapsulation; object scope.
2/5: Able to explore and consume new simple APIs with little guidance. Consume and understand generic methods and classes; can create standard LINQ queries and consume IEnumerables / IQueryables. Good understanding of value and reference types, boxing and disposing of objects. Good understanding of object lifetime and the GC. Understands Reflection and its pros/cons.
3/5: Deep understanding of generics. Good understanding of the components of C#3 e.g. Extension methods, lambdas, expression trees etc. Comfortable with usage of Action<T> and Func<T>. Understanding of multi-threaded processes.
4/5: Good understanding of the role of the compiler in C# and its relationship with the .NET runtime. Good understanding of the yield keyword. Understanding of closures. Knowledge of Task<T> and its relationship with threads. Understanding of dynamic, both in consumption and the underlying technology.
5/5: Expert level in nearly all areas e.g. compiler behaviour, memory allocation, expression trees, GC behaviour etc. etc. Perhaps your name is Eric Lippert, Anders Hejlsberg or Mads Torgersen etc. etc.?
I suspect even the individuals I mention don't know "everything" about the language; C# and its relationship with .NET is now so vast, and so complex, that no-one could possibly know the intricacies of every component of the language.
Now, I bet that people could write some great systems with just a 2/5 or 3/5. I would also say that there is little relationship between someone writing fault tolerant, readable and maintainable code and their knowledge of the language. All it means is that you have more tools at your disposal when faced with a problem; the challenge is knowing when to use the right one for the job at hand.
An experienced Java developer colleague of mine has recently transferred into a .NET based team. He asked me, aside from the obvious things like reading up on the differences between C# and Java and getting a reference book or two on parts of .NET, what are the common practices or attitudes of .NET developers? How do they differ from your typical Java developer?
I think it's a great question, and it got me wondering – what are the typical behaviours of a .NET developer? What qualities (or lack thereof!) characterise the average .NET developer? Having worked with quite a few .NET developers over the last decade, I thought I'd share my thoughts on what the average .NET developer is like, in terms of skills and attitudes.
The attributes of a .NET Developer
Before I jump in, I should point out that firstly, my experiences might not reflect what you have seen; and secondly, that this is what I have seen of a “typical” .NET developer that has some experience – but there are definitely those that I have worked with that did not fit the profile below.
The average .NET developer – let's call him Bob – generally does not have a great grasp of the GoF design patterns. He may actually use some of them day-to-day without realising that they correspond to named patterns. He knows the names of two or three popular patterns such as Factory and Singleton, but often struggles to identify opportunities to apply them in a real-world scenario.
Bob sees himself as a client of the services offered by Visual Studio and the .NET framework, and sees little reason to understand why things work the way they do. This includes the C# compiler, the code generation features of Visual Studio, and the classes in the framework itself. This is by no means a bad thing – after all, these things are there to provide services to the developer – but there is often a real lack of understanding of what's going on under the hood. Situations such as these might be applicable to Bob: -
Bob compiles lots of code every day, but might not know what happens during that process to produce an executable program.
Bob uses foreach loops all the time, but does not know how they relate to IEnumerable.
Bob loves LINQ, but does not know what deferred execution is, or how query operators relate to extension methods.
Bob loves Entity Framework 4, but does not know what IQueryable is.
Bob uses List<T> all the time, but has no idea how to write his own generic classes.
Now, the sort of understanding I'm talking about above doesn't mean being able to describe the exact lines of C# and IL that are generated when you use the "yield" keyword – but it does mean at least having an idea of how state machines fit in with it.
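As a concrete example of the foreach/IEnumerable relationship mentioned above, a minimal sketch: -

```csharp
using System;
using System.Collections.Generic;

class IteratorDemo
{
    // The compiler rewrites this method into a hidden state machine
    // class; nothing runs until the sequence is enumerated.
    static IEnumerable<int> Evens(int max)
    {
        for (int i = 0; i <= max; i += 2)
            yield return i;
    }

    static void Main()
    {
        // foreach is roughly sugar for the GetEnumerator / MoveNext /
        // Current pattern written out below.
        using (IEnumerator<int> e = Evens(6).GetEnumerator())
        {
            while (e.MoveNext())
                Console.WriteLine(e.Current); // prints 0, 2, 4, 6
        }
    }
}
```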
Test Driven Development
Bob has heard of TDD and thinks it sounds like a great idea, but for the time it would take to apply. He might even write unit tests from time to time. However, he is not experienced in the art of writing AAA unit tests, or in the difference between testing a single class and writing an integration test that cuts across multiple tiers. He might write unit tests, but he doesn't have enough confidence in them to rely on them – so he still wants to manually check the same bit of code in the debugger as well.
Bob is happy to use the code generation tools he is aware of in order to rapidly fulfil business requirements, such as the Windows Forms designer or the RIA Services domain service generator – IMHO a good thing. But he sees these things as black boxes, without wanting to know what they do behind the scenes. What code do they generate? At the same time, he will often be content to write reams of boilerplate code manually rather than create a code generation tool himself, or ask whether there is another tool out there that will do the job for him.
The Commercial Reality
Bob will sometimes treat writing high-quality, maintainable code as an optional nice-to-have in the quest to meet tight deadlines and timescales. He will often prefer the short-term goal of achieving today's business requirement to building a codebase that can more quickly deliver the dozen business requirements that must be delivered by the end of next week.
Why is this?
The above may sound unfair to many developers. After all, if you’re able to write software today in such a manner, what’s the problem? Well, for one, you could be doing the same thing in much less time, and therefore delivering far better value to your client / company.
Years ago the average C++ developer could not get by without a knowledge of what a pointer is, or what malloc() does, or how to write an efficient sort algorithm. Nowadays the barriers to entry in the world of .NET are much lower. In itself this is a good thing – anything that makes it easier to code by getting rid of having to write boilerplate code gets a thumbs up from me. But it’s also important that we as developers understand the “how things happen” as well as the “what they do”.
Professional development is still, to my mind, a highly skilled trade that we are constantly learning about, and we need to remember that just because we can easily write an application that can communicate over a network, save to a database, give error handling etc. with relative ease, it doesn’t mean that we can or should forget what’s going on behind the scenes.
Just came across this .NET type – very cool.
If you’re creating types and are finding that they are quite heavy duty in terms of memory allocation, you might want to look at Lazy<T>. Example: -
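Something along these lines (implementation details such as the PeopleFactory internals are assumed): -

```csharp
using System.Collections.Generic;

public class Person
{
    // Eagerly allocates ~10K for every instance.
    private byte[] data = new byte[10000];

    public byte[] Data
    {
        get { return data; }
    }
}

public static class PeopleFactory
{
    public static List<Person> CreatePeople()
    {
        var people = new List<Person>();
        for (int i = 0; i < 100000; i++)
            people.Add(new Person());
        return people; // ~1GB of byte arrays allocated up front
    }
}
```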
Person is a type which has a large property on it – a 10K byte array. We are creating 100,000 of these Person objects in our PeopleFactory class – that’s around a gigabyte of memory gone, plus the time taken to allocate it etc.
You could normally get around this by doing stuff like this:-
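That is, a null check in the getter so the allocation only happens on first access – something like: -

```csharp
public class Person
{
    private byte[] data;

    public byte[] Data
    {
        get
        {
            // Allocate on first access only.
            if (data == null)
                data = new byte[10000];
            return data;
        }
    }
}
```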
This works just fine. Now, creating 100,000 people uses up only around 1.2MB; as soon as we access the property on each of them, we'll inflate those objects to use the full 1GB. However, the pattern is pretty repetitive – imagine if we had dozens of properties.
Enter Lazy<T>. This handy type encapsulates this sort of logic for us; we can now rewrite our property like so: -
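A sketch of the Lazy<T>-based version (field name assumed): -

```csharp
using System;

public class Person
{
    // The factory delegate runs only when .Value is first read.
    private readonly Lazy<byte[]> data =
        new Lazy<byte[]>(() => new byte[10000]);

    public byte[] Data
    {
        get { return data.Value; }
    }
}
```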
The Func<Byte[]> that is provided to the constructor is what's executed when the Value property is first accessed. So this simplifies your property accessors once again and gives you constructor-like creation in a lazy manner.
The memory usage is the same for the byte arrays, but there is a slightly higher memory footprint initially than in the manually-lazy loaded example above – Lazy<T> is a type in itself which requires memory, plus there’s the delegate to store the code to create the byte array.
However, Lazy<T> is still a useful tool – when you have many properties on a class and want to instantiate them in a lazy fashion e.g. perhaps you need to call a web service to retrieve the data, or go to the database etc. – using Lazy<T> can keep your properties easy to read and still give you lazy loading of them.
This very simple example doesn’t illustrate the full flexibility of the type – look at the MSDN article to see what else it can do.