Strings and Immutability
In programming, a string is a sequence of System.Char characters, conceptually an array, that together represent text. In C#, you can declare a string and print its value as follows:
string str = "Hello World";
Console.WriteLine(str);
//This will print out "Hello World" to the console.
In C#, a string is immutable, which means it is read-only: once it has been created, its contents cannot be changed.
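As a small illustration of what read-only means in practice, here is a minimal sketch. The variable names are made up for this example, and it assumes a project that allows top-level statements (C# 9 or later).

```csharp
using System;

string original = "Hello World";

// ToUpperInvariant() does not change 'original'; it returns a new string.
string shouted = original.ToUpperInvariant();

Console.WriteLine(original); // Hello World
Console.WriteLine(shouted);  // HELLO WORLD

// Individual characters can be read through the indexer...
char first = original[0];
Console.WriteLine(first);    // H

// ...but not written. Uncommenting the next line is a compile-time error,
// because the string indexer is read-only.
// original[0] = 'J';
```

Every string method that appears to modify text works the same way: it hands back a new string and leaves the one it was called on untouched.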
Why Strings are Immutable
In C#, the CLR (Common Language Runtime) is responsible for determining where to store strings. In the last section I noted that a string is conceptually an array of characters, and the CLR stores those characters in a single fixed-size block of memory, much like an array. Arrays are a fixed-size data structure, meaning that they cannot grow or shrink. Once an array is assigned a size, that size cannot be changed. To make an array larger, the data must be copied into a new array, which the CLR places in a new block of memory. If you edit a string, you are not really modifying that string; rather, the CLR creates a new string object in memory for the modified text, and the original string is removed by the garbage collector once nothing references it anymore.
Let’s Look Under the Hood
While it is not strictly required, it is very helpful to know what is going on in memory while you are writing your code, whether it’s with strings, data structures, or something else. It helps in a variety of ways, such as fixing software bugs and diagnosing memory leaks or performance issues. I find the best comparison is knowing what is going on under the hood of your vehicle: it is not critical, but knowing the parts under the hood and the purpose each one serves is helpful when something goes wrong.
Let’s look closer at the example from the previous section. We will create a new string, then modify it and explain what is happening in memory during all of this.
string str = "Hello World";
Console.WriteLine(str);
//This will print out "Hello World" to the console.
In the program, we are declaring a string variable named str and assigning it the value “Hello World”.
However, in memory, the CLR is allocating a block of space to store the string. For simplicity, let’s say the CLR uses memory location 1000 for it. Since a string is an object (a reference type), the CLR stores its characters on the heap, not on the stack, and the variable str simply holds a reference to location 1000. Now, let’s modify this string.
str += " edited";
Console.WriteLine(str);
//This will print out "Hello World edited" to the console.
When you run this code, you will see “Hello World edited”. Since the string is immutable, the CLR does not change the characters at location 1000. Instead, it allocates a new block of memory, let’s say location 1500, for a new string containing the combined text, and str now refers to that location. Eventually, once nothing refers to the original string at location 1000 anymore, the garbage collector can reclaim it and clear it out of memory.
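You cannot print locations like 1000 and 1500 directly in ordinary C#, but you can check whether two variables refer to the same object. The sketch below does that with object.ReferenceEquals; the extra variable named before is an assumption added for this example, and the code assumes top-level statements (C# 9 or later).

```csharp
using System;

string str = "Hello World";
string before = str;          // a second reference to the same string object

str += " edited";             // builds a new string and points str at it

Console.WriteLine(before);    // Hello World
Console.WriteLine(str);       // Hello World edited

// Prints False: str now refers to a different object than it did before.
Console.WriteLine(object.ReferenceEquals(before, str));
```

Note that as long as before still refers to the original string, the garbage collector will not reclaim it.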
Pros and Cons
Like almost everything else, immutable strings have advantages and disadvantages. What are the advantages? One is that they are thread safe. In a multi-threaded system, any number of threads can read the same string without locks, because no thread can change its contents; when you modify a string, you are really just creating a new object in memory. (A variable that several threads reassign still needs the usual synchronization.) Another advantage is that you do not have to worry about other code accidentally changing them. You do not need the additional safety measures (e.g. a defensive copy) that you may need with a mutable object.
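To make the defensive-copy point concrete, here is a minimal sketch that contrasts a string with a mutable char array. The helper functions AddSuffix and Scramble are hypothetical names invented for this example, and the code assumes top-level statements (C# 9 or later).

```csharp
using System;

// Hypothetical helper: reassigning its own parameter creates a new string
// and cannot affect the string the caller passed in.
static string AddSuffix(string text)
{
    text += " (copy)";
    return text;
}

// Hypothetical helper: a char array is mutable, so this changes the
// caller's data in place.
static void Scramble(char[] letters)
{
    letters[0] = 'J';
}

string title = "Hello World";
string result = AddSuffix(title);
Console.WriteLine(title);   // Hello World
Console.WriteLine(result);  // Hello World (copy)

char[] word = "Hello".ToCharArray();
Scramble(word);
Console.WriteLine(new string(word)); // Jello
```

With the array, a caller who wants to keep the original data has to pass a copy. With the string, no copy is needed.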
What is the downside of immutable strings? The main issue is that repeatedly changing a string can lead to performance issues. A code example makes this clear. If you refer back to the code snippets from the previous section, you will see that we only modified the string one time. Suppose we have a scenario like this:
string str = "Hello World";
Console.WriteLine(str);
//This will print out "Hello World" to the console.
for (int i = 0; i < 10; i++)
{
    str += " again";
    Console.WriteLine(str);
}
This code prints the string ten times, and each iteration appends one more “again” to the original “Hello World”. However, since the string is immutable, on every iteration the CLR allocates new space in memory, copies the existing text into it, and stores a new string object for str to refer to. Each new block is bigger than the last, and each previous string is left behind for the garbage collector.
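One common way to avoid these repeated allocations in .NET is System.Text.StringBuilder, which keeps a growable internal buffer and only produces a string when you ask for one. The source text does not show this, so treat the following as a sketch of one possible alternative and not as the author's recommendation. It assumes top-level statements (C# 9 or later), and it prints only the final result instead of printing on every iteration.

```csharp
using System;
using System.Text;

// Start the builder with the original text.
var builder = new StringBuilder("Hello World");

for (int i = 0; i < 10; i++)
{
    // Append writes into the builder's internal buffer instead of
    // creating a new string on every iteration.
    builder.Append(" again");
}

// A single string is created here, once the text is complete.
string result = builder.ToString();
Console.WriteLine(result);
```

Calling ToString() inside the loop to print each intermediate value would create a new string per iteration again, so the benefit is largest when only the final text is needed.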
Source: Medium - MBARK T3STO
The Tech Platform