the case

the question: what is the difference between reference types and value types in C#, and how can xUnit tests prove that each kind of type behaves as expected?

findings

1. reference types and value types

2. type test

  • create a new file / new class for type tests
  • create a binding that invokes a method accepting a name of the object
    • this private method (no keyword needed, private is default) constructs the object
    • no fact attribute on that method

  • a newly generated method gets the return type object by default
    • object is the base type of everything in .NET
        Book GetBook(string name) // private keyword omitted; return type changed to Book
        {
            return new Book(name);
        }

3. test: can 2 different variables refer to the same object?

  • Use Assert.Same()
        [Fact]
        public void TwoVarsReferSameObject()
        {
            var book1 = GetBook("Book 1");
            var book2 = book1;
            Assert.Same(book1, book2); // prove that both bindings point to the same object
        }
  • Assert.Same() in xunit is an abstraction of the .NET Object.ReferenceEquals()
    • everything in .NET derives from the object base class
    • the above test could be re-written as
`Assert.True(Object.ReferenceEquals(book1, book2));` 
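As a complementary sketch, two separately constructed books with the same name are different objects, so the reference check fails for them. The Book class below is a minimal stand-in (an assumption: a Name property and a one-argument constructor), because the real Book class is not shown in these notes; ReferenceDemo and SameInstance are hypothetical names.

```csharp
using System;

// Minimal stand-in for the Book class used in these notes (an assumption).
public class Book
{
    public string Name { get; set; }
    public Book(string name) { Name = name; }
}

public static class ReferenceDemo
{
    // True only when both bindings point to the very same object.
    public static bool SameInstance(Book a, Book b) => Object.ReferenceEquals(a, b);

    public static void Main()
    {
        var book1 = new Book("Book 1");
        var book2 = book1;               // copies the reference, not the object
        var book3 = new Book("Book 1");  // a separate object with an equal name

        Console.WriteLine(SameInstance(book1, book2)); // True
        Console.WriteLine(SameInstance(book1, book3)); // False
    }
}
```

In xUnit terms, Assert.Same(book1, book3) would fail here even though both names are equal.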

4. test: prove you can change the name of a book

  • prove that the name can be changed through a reference
  • note: programming languages pass parameters into methods either a) by reference or b) by value; in C# a parameter is passed by value by default (unless ref, out or in is used)
        [Fact]
        public void CanSetNameFromReference()
        {
            var book1 = GetBook("Book 1"); // 1. instantiate an object
            SetName(book1, "New Name"); // 2. copy the reference stored in book1

            Assert.Equal("New Name", book1.Name); // prove that the object behind book1 has a new name
        }

        private void SetName(Book book, string name) // 3. the copied reference lands in the first parameter
        {
            book.Name = name;
        }

  • the value that is passed is a POINTER to a memory location (an address, a reference to a book object)
  • you cannot inspect the pointer value itself, not even in a debugger; the runtime hides it, and working with raw pointers is considered unsafe
  • IF the runtime passed parameters by reference, the book parameter in the SetName method would not receive a pointer value, but a reference to the variable book1 ➔ there would be 2 references
    • a reference from the parameter to book1
    • a reference within book1 to the object
    • in this scenario the other method could change the book1 binding itself
    • in the pass-by-value scenario you can never change that binding

5. passing parameters by reference

  • the following method does not change the name of book1
  • the book parameter of GetBookSetName holds a copy of the reference stored in book1; pointing the parameter at a new object leaves book1 untouched
        [Fact]
        public void CSharpIsPassByValue()
        {
            var book1 = GetBook("Book 1"); // initialize as Book 1
            GetBookSetName(book1, "New Name"); // try to rename to New Name

            Assert.Equal("New Name", book1.Name); // expect the rename to succeed ➔ FAILS, book1.Name is still "Book 1"
        }

        private void GetBookSetName(Book book, string name) // a copy of the reference is passed here (pointer to the Book object)
        {
            book = new Book(name); // rebinds only the local parameter
            book.Name = name;
        }

        Book GetBook(string name)
        {
            return new Book(name);
        }

and that’s exactly what the designers of the C# language wanted: when you pass a variable to another method, you don’t want that method to unexpectedly change the value or the reference stored in your variable. that would be an unexpected side effect

  • how to rename then?
    • pass the parameter by reference instead of by value ➔ the method receives a reference to the variable book1 itself and can point it at a different object
    • use the ref keyword, both in the method signature and at the call site
        [Fact]
        public void CSharpCanPassByRef()
        {
            var book1 = GetBook("Book 1");                      //1. Pass the name "Book 1" into a helper that constructs the book
            GetBookSetName(ref book1, "New Name");              //4. Pass a reference to the book1 variable, plus the new name

            Assert.Equal("New Name", book1.Name);               //7. Passes: book1 now points to the new object
        }

        private void GetBookSetName(ref Book book, string name)  //5. Accepts a reference to a Book variable (not a value or literal)
        {
            book = new Book(name);                               //6. Overwrites the caller's variable with a newly constructed book
        }

        Book GetBook(string name)                                //2. Assign the passed literal to the name parameter
        {
            return new Book(name);                               //3. Return constructed object with the passed-in literal
        }
  • ref lets the called method overwrite the caller’s binding itself: here book1 is replaced by a newly constructed object

  • ref is rarely needed in practice (some APIs use it)
  • there is also the out keyword
    • difference: the compiler assumes the binding has not been initialized, so the method must assign it before returning
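A minimal sketch of the out keyword, assuming a hypothetical helper TryGetLength (not part of the original tests). It shows that the caller may pass an unassigned variable, and that the method has to assign the out parameter on every path. The second call uses the framework's own int.TryParse, which follows the same pattern.

```csharp
using System;

public static class OutDemo
{
    // Hypothetical helper: 'length' must be assigned before the method returns.
    public static bool TryGetLength(string text, out int length)
    {
        if (text == null)
        {
            length = 0;      // assignment is still mandatory on the failure path
            return false;
        }
        length = text.Length;
        return true;
    }

    public static void Main()
    {
        int length;                               // not initialized; allowed with out
        bool ok = TryGetLength("Book 1", out length);
        Console.WriteLine($"{ok} {length}");      // True 6

        // The same pattern in the framework: int.TryParse
        bool parsed = int.TryParse("42", out int number);
        Console.WriteLine($"{parsed} {number}");  // True 42
    }
}
```

With ref, the same call would not compile unless length had been assigned a value first.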

6. working with data types

  • the following test fails
        [Fact]
        public void Test1()
        {
            int test_binding = GetInt(); // 1. test_binding == 3
            SetInt(test_binding); // 2. pass value 3 into the setter

            Assert.Equal(42, test_binding); // 5. the test, perhaps as a surprise, FAILS
        }

        private void SetInt(int test_parameter) // 3. assign the value 3 to test_parameter
        {
            test_parameter = 42; // 4. overwrite the memory location of test_parameter with the value 42 ➔ NO CHANGE TO THE VALUE OF test_binding
        }

        private int GetInt()
        {
            return 3;
        }

  • for the test to pass, the compiler has to be told explicitly to pass by reference with the ref keyword
        [Fact]
        public void TestDataTypes()
        {
            int test_binding = GetInt(); // 1. test_binding == 3
            SetInt(ref test_binding); // 2. pass a reference to test_binding into the method

            Assert.Equal(42, test_binding); // 5. the test PASSES
        }

        private void SetInt(ref int test_parameter) // 3. accepts only a reference to a variable, not a value ➔ test_parameter becomes an alias for test_binding
        {
            test_parameter = 42; // 4. the caller's variable (test_binding itself) changes to 42
        }

        private int GetInt()
        {
            return 3;
        }

7. question: differentiation between value and reference types

  • any type defined as a class ➔ REFERENCE TYPE
public class foo {

    // every binding of type foo holds a reference
}

  • working with classes is the bread and butter of day-to-day work
  • to define your own value type, use struct
    • it needs to behave like a value type
    • typically very small (like int, float, DateTime)
    • struct is an abbreviation of STRUCTURE, a data structure that is
    • typically just a group of fields (a struct can have methods too, as a class does)
  • this can be much more efficient in certain scenarios, but this has to be understood properly
public struct foo {

    // every binding of type foo holds the value itself
}
  • there are types within the .NET framework that are struct-based
    • you need to be able to tell them apart: when a reference type is passed to a method, the method can change fields inside the caller’s object; with a value type it only changes its own copy
  • to check which kind of type you are working with, press F12 with the cursor on that type ➔ metadata view of the type
    • note: int is an alias for Int32 and double for Double; DateTime has no alias
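The difference can be sketched with two hypothetical types, PointClass and PointStruct, which are illustrative only and do not appear in the original tests. Both have the same single field; only the class/struct keyword differs, and with it the copying behavior.

```csharp
using System;

// Hypothetical types for illustration only.
public class PointClass { public int X; }
public struct PointStruct { public int X; }

public static class CopyDemo
{
    public static void Bump(PointClass p) { p.X++; }   // modifies the caller's object
    public static void Bump(PointStruct p) { p.X++; }  // modifies a local copy only

    public static void Main()
    {
        var c = new PointClass { X = 1 };
        var s = new PointStruct { X = 1 };

        Bump(c);
        Bump(s);
        Console.WriteLine(c.X); // 2
        Console.WriteLine(s.X); // 1

        var s2 = s;   // assignment copies the whole struct
        s2.X = 10;
        Console.WriteLine(s.X); // 1
    }
}
```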

8. strings in c#: a special case; inconsistency tripping up the newcomers

  • a string in C# is always a reference type
  • but it often behaves like a value type
  • the reason: strings are immutable. You cannot modify an existing string once created, only replace it
  • the following test fails
        [Fact]
        public void StringsBehaveLikeValueTypes()
        {
            string name = "Pavol";          //1. assign "Pavol" to a binding of the reference type string
            MakeUppercase(name);            //2. pass string into a method
            Assert.Equal("PAVOL", name);    //5. failed test
        }

        private void MakeUppercase(string parameterToUpper) //3. assign the value of the argument's reference to the parameter
        {
            parameterToUpper.ToUpper();                     //4. run the capitalization method on the parameter (no effect: the returned string is discarded)
        }

  • none of the methods you can perform on a string will manipulate the existing string. They all return a new string with desired attributes
  • to make the test pass the following changes are necessary
        [Fact]
        public void StringsBehaveLikeValueTypes()
        {
            string name = "Pavol";
            string nameToUpper = MakeUppercase(name); // 1. bind the returned value to a new string binding
            Assert.Equal("PAVOL", nameToUpper);
        }

        private string MakeUppercase(string parameterToUpper) // 2. change the return type from void to string
        {
            return parameterToUpper.ToUpper(); // 3. return the capitalized copy of the parameter
        }

8.1. note: stack is an implementation detail

I find this characterization of a value type based on its implementation details rather than its observable characteristics to be both confusing and unfortunate. Surely the most relevant fact about value types is not the implementation detail of how they are allocated, but rather the by-design semantic meaning of “value type”, namely that they are always copied “by value”. If the relevant thing was their allocation details then we’d have called them “heap types” and “stack types”. But that’s not relevant most of the time. Most of the time the relevant thing is their copying and identity semantics.

9. taking advantage of garbage collection

  • C# is a managed language
  • the garbage collector runs automatically, so memory is managed automatically as well

sources