Streams, Readers and Writers in C#


8 February 2013, by

I have recently started to hold weekly sessions in which I attempt to explain a subject that I think I know something about, to a small group of people who think they don’t but want to. The first one was very successful – at least if you judge by the amount that I learnt in the process!

The topic of the session was Streams in C#. But it rapidly transpired that the problem wasn’t so much Streams as “Streams and all the other stuff you tend to use with them”, so we spent a lot of time learning about the various Writer classes in the C# base class library. The number one conclusion was that you really need to Just Know which thing to use in which situation, and this blog post attempts to summarise the key things everyone will have forgotten by the time they next use a Stream or something of its ilk.

I’ll start with a little background, and then move on to a ready reference.

What is a Stream?

A Stream is defined by MSDN as: “A generic view of a sequence of bytes”. I like to think of it as a Thing, into which you can put bytes, and / or out of which you can get bytes. It may in general Do Something to the bytes as they pass by, and will almost certainly send them somewhere – either a place of the Stream’s choosing (e.g. a file), or another Stream.

The key thing is that a Stream deals with bytes. In some ways the simplest stream is a MemoryStream – and that is just an array of bytes, and a pointer to the “current position” in the array so you can read / write to the array sequentially.
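To make the “array of bytes plus a current position” idea concrete, here is a minimal sketch. The class and method names are my own, invented for illustration; only MemoryStream and its members come from the BCL.

```csharp
using System;
using System.IO;

public static class MemoryStreamBasics
{
    // Writes three bytes, rewinds, and reads them back one at a time.
    public static byte[] WriteThenRead()
    {
        using (var stream = new MemoryStream())
        {
            stream.Write(new byte[] { 10, 20, 30 }, 0, 3);

            // After writing, the position sits just past the last byte written.
            Console.WriteLine("Position after write: " + stream.Position);

            // Go back to the start before reading.
            stream.Position = 0;

            var result = new byte[3];
            for (int i = 0; i < result.Length; i++)
            {
                // ReadByte returns the next byte, or -1 at the end of the stream.
                result[i] = (byte)stream.ReadByte();
            }
            return result;
        }
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", WriteThenRead()));
    }
}
```

Every other Stream follows the same pattern of reading and writing bytes at a position, although not every Stream lets you move that position around.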

Streams can be read-only, write-only, or read-write. Note that this is different from Java, where InputStream and OutputStream are separate classes.
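Because a single Stream class covers all three cases, you ask the instance what it can do via its CanRead, CanWrite and CanSeek properties. A small sketch, with a made-up helper called Describe:

```csharp
using System;
using System.IO;

public static class StreamCapabilities
{
    // Summarises what a given stream instance supports.
    public static string Describe(Stream stream)
    {
        return "read=" + stream.CanRead + " write=" + stream.CanWrite + " seek=" + stream.CanSeek;
    }

    public static void Main()
    {
        var data = new byte[] { 1, 2, 3 };

        using (var readWrite = new MemoryStream())
        using (var readOnly = new MemoryStream(data, writable: false))
        {
            Console.WriteLine(Describe(readWrite)); // read=True write=True seek=True
            Console.WriteLine(Describe(readOnly));  // read=True write=False seek=True
        }
    }
}
```

Calling Write on the read-only instance would throw a NotSupportedException, so checking CanWrite first is the polite thing to do when you have been handed a Stream of unknown origin.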

In general, Streams are lazy. If you put something in, it doesn’t necessarily come straight out the other side – it may be buffered. But also in general you don’t have to wait until you’ve finished putting something in before the data comes out. So to take a FileStream as an example, if you write bytes to it at a steady rate for a period of time, after a few minutes you will find there’s some stuff on disk, but chances are not everything you’ve written will be on disk yet.
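Exactly when a FileStream’s bytes land on disk also depends on the operating system, so as a stand-in this sketch shows the same laziness with a BufferedStream (one of the BCL stream classes) wrapped around a MemoryStream. The class and method names are mine; the 4096-byte buffer size is an arbitrary choice.

```csharp
using System;
using System.IO;

public static class BufferingDemo
{
    // Returns how many bytes have reached the inner stream before and after Flush.
    public static long[] InnerLengthBeforeAndAfterFlush()
    {
        using (var inner = new MemoryStream())
        using (var buffered = new BufferedStream(inner, 4096))
        {
            // Five bytes fit comfortably in the buffer, so nothing is passed on yet.
            buffered.Write(new byte[] { 1, 2, 3, 4, 5 }, 0, 5);
            long before = inner.Length;

            // Flush pushes whatever is buffered through to the inner stream.
            buffered.Flush();
            long after = inner.Length;

            return new[] { before, after };
        }
    }

    public static void Main()
    {
        long[] lengths = InnerLengthBeforeAndAfterFlush();
        Console.WriteLine("before flush: " + lengths[0] + ", after flush: " + lengths[1]);
    }
}
```

The practical lesson is to call Flush, or Dispose the stream, before assuming your data has arrived anywhere.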

What is a Reader? What is a Writer?

There’s actually no such thing as a Reader or Writer class, but there are various BCL classes whose name ends in “…Reader” and “…Writer”. Broadly speaking, these classes are responsible for converting things from one format to another. They typically use a stream-like approach in as much as you feed things in one end and a converted form comes out of the other end as you go, in a lazy manner but without waiting for you to finish writing stuff in.

This is best illustrated by considering the most common Writer, which is the StreamWriter. This is a Writer that can:

  • Take an object and convert it to text in a given encoding
  • Write the result to a Stream

“Convert it to text” is really just the same as calling ToString(), but the StreamWriter will nicely encapsulate concerns around text encoding (UTF-8 etc.), and deal with converting the result into the bytes necessary to pass to the Stream.
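Here is a rough sketch of those two steps in action: the same value written through a StreamWriter produces different bytes depending on the encoding you choose. The Encode helper is hypothetical, and I construct the encodings without a byte order mark so that only the text itself shows up in the output.

```csharp
using System;
using System.IO;
using System.Text;

public static class StreamWriterEncoding
{
    // Writes any object as text into a MemoryStream and returns the resulting bytes.
    public static byte[] Encode(object value, Encoding encoding)
    {
        var stream = new MemoryStream();
        using (var writer = new StreamWriter(stream, encoding))
        {
            // Converts the value to text, then encodes the characters as bytes.
            writer.Write(value);
        }
        // ToArray still works after the writer has closed the stream.
        return stream.ToArray();
    }

    public static void Main()
    {
        var utf8 = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false);
        var utf16 = new UnicodeEncoding(bigEndian: false, byteOrderMark: false);

        Console.WriteLine(BitConverter.ToString(Encode(42, utf8)));        // 34-32
        Console.WriteLine(BitConverter.ToString(Encode("\u20AC", utf8)));  // E2-82-AC
        Console.WriteLine(BitConverter.ToString(Encode("\u20AC", utf16))); // AC-20
    }
}
```

The integer 42 becomes the two characters “4” and “2”, and the euro sign takes three bytes in UTF-8 but two in UTF-16. The Stream underneath never knows any of this; it just receives bytes.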

Stream Classes

Here’s a ready reference of Stream classes in the .NET BCL. It’s not supposed to be completely comprehensive, but it should cover the most common uses.

FileStream

Reads or writes to a file. There are other ways to do file access – for example reading the entire contents of a file can be done by calling File.ReadAllText. However using a Stream has the benefits of:

  • Interchangeability with other Streams in your code – no need for the rest of your code to rely on the backing store being a file.
  • Streaming the data – you don’t need to read the whole file into memory at once, but can read it as you process it.
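Both benefits can be seen in one small sketch. CountZeroBytes is a made-up method that accepts any Stream and reads it in chunks, so it neither knows nor cares whether the bytes come from a file or from memory, and it never holds more than one chunk at a time.

```csharp
using System;
using System.IO;

public static class ByteCounter
{
    // Counts the zero bytes in any readable Stream, one 4 KB chunk at a time.
    public static int CountZeroBytes(Stream source)
    {
        var buffer = new byte[4096];
        int zeros = 0;
        int read;

        // Read returns the number of bytes actually read, and 0 at the end of the stream.
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                if (buffer[i] == 0) zeros++;
            }
        }
        return zeros;
    }

    public static void Main()
    {
        var sample = new byte[] { 0, 1, 0, 2, 0 };
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllBytes(path, sample);

            using (var file = new FileStream(path, FileMode.Open, FileAccess.Read))
            {
                Console.WriteLine(CountZeroBytes(file)); // 3
            }
            using (var memory = new MemoryStream(sample))
            {
                Console.WriteLine(CountZeroBytes(memory)); // 3
            }
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```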

MemoryStream

Reads or writes to an array of bytes in memory. If you create a MemoryStream with no constructor arguments, you get a byte array that automatically expands when you need to. MemoryStreams are useful if you want to mock out a “proper” Stream, say for testing purposes. They can also be useful if you have Stream-based code but don’t actually want to persist the resulting data anywhere – but be careful because by doing this you’re losing the advantage of Streams that not all data needs to be loaded into memory at once, and there’s probably a neater way.

When using a MemoryStream you will probably want to put some data in, and then get it back out again. You can write to it using a StreamWriter and read using a StreamReader, perhaps – see below. But you should watch out for two caveats:

  • The MemoryStream has a pointer into its byte array telling it the “current position” in the array. This is shared for reads and writes. So if you write some data, and then switch to reading, you’ll start reading from the end of what you’ve just written and get nothing back. Fix this by calling Seek(0, SeekOrigin.Begin) on the Stream (or setting Position to 0) to get back to the beginning before you start reading.
  • If you’ve been a good programmer and have put your StreamWriter in a using-block, you will be unable to pass the same MemoryStream into a StreamReader. This is because the Writer’s Dispose method will Close your Stream. Possible solutions are to not call Dispose on the Writer (which is apparently ok, if you know what you’re doing, as long as you call Flush), to use the StreamWriter constructor overload that takes a leaveOpen argument (available from .NET 4.5), or to create two separate MemoryStreams. For the last option you can call ToArray on the first and feed the resulting byte array into the second, making sure that you either Close/Dispose the Writer, or call Flush, before you do so to make sure the Writer has finished writing everything to the Stream. Prefer ToArray to GetBuffer here: GetBuffer returns the whole underlying buffer, including any unused bytes beyond the data you wrote.
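Here is a sketch of two ways around both caveats. The class and method names are my own. The first version assumes .NET 4.5 or later, where StreamWriter has a constructor overload with a leaveOpen parameter; the second uses the two-MemoryStream approach.

```csharp
using System;
using System.IO;
using System.Text;

public static class MemoryStreamRoundTrip
{
    // Option 1: keep the stream open after the writer is disposed, rewind, then read.
    public static string ViaLeaveOpen(string text)
    {
        using (var stream = new MemoryStream())
        {
            using (var writer = new StreamWriter(stream, new UTF8Encoding(false), 1024, leaveOpen: true))
            {
                writer.Write(text);
            } // Dispose flushes the writer but leaves the stream open.

            // The position is at the end of the written data, so go back to the start.
            stream.Seek(0, SeekOrigin.Begin);

            using (var reader = new StreamReader(stream, Encoding.UTF8))
            {
                return reader.ReadToEnd();
            }
        }
    }

    // Option 2: let the writer close the first stream, then copy its bytes into a second.
    public static string ViaSecondStream(string text)
    {
        var first = new MemoryStream();
        using (var writer = new StreamWriter(first, new UTF8Encoding(false)))
        {
            writer.Write(text);
        }

        // ToArray works on a closed MemoryStream and returns only the bytes written.
        byte[] bytes = first.ToArray();

        using (var reader = new StreamReader(new MemoryStream(bytes), Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        Console.WriteLine(ViaLeaveOpen("hello"));
        Console.WriteLine(ViaSecondStream("hello"));
    }
}
```

If you comment out the Seek call in the first version, ReadToEnd returns an empty string, which is the first caveat in action.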

NetworkStream and other network streams

Reads or writes to a network socket. You can also get a Stream back from methods like HttpWebRequest.GetResponseStream() – they all work in essentially the same way.

Other streams I hadn’t come across before

A BufferedStream lets you control just how lazy the next stream in the sequence effectively is – if you wrap a NetworkStream in a BufferedStream, then writes to the Stream will only actually go out over the network when there’s enough data, which may improve performance.

There’s a System.IO.Compression.GZipStream – handy for compressing stuff on the fly. (And a DeflateStream in the same place).

And there’s a System.Security.Cryptography.CryptoStream, if you want to encrypt your data. Remember that Streams like this can be chained together: each wrapper takes the next Stream in the sequence as a constructor argument. To write compressed, encrypted data to your file, you would write into a GZipStream, which wraps a CryptoStream, which wraps a FileStream. Compress before you encrypt, because encrypted data looks random and barely compresses at all.
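As a sketch of chaining, the following puts a GZipStream on the outside and a CryptoStream inside it, so the data is compressed first and then encrypted (encrypted data compresses poorly, so that order is the useful one). The method names are invented, AES is simply my choice of algorithm, and I use a MemoryStream at the end of the chain rather than a FileStream so the example stands alone.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Security.Cryptography;
using System.Text;

public static class CompressThenEncrypt
{
    // Text -> StreamWriter -> GZipStream -> CryptoStream -> MemoryStream.
    public static byte[] Protect(string text, byte[] key, byte[] iv)
    {
        var output = new MemoryStream();
        using (var aes = Aes.Create())
        using (var encryptor = aes.CreateEncryptor(key, iv))
        using (var crypto = new CryptoStream(output, encryptor, CryptoStreamMode.Write))
        using (var gzip = new GZipStream(crypto, CompressionMode.Compress))
        using (var writer = new StreamWriter(gzip, new UTF8Encoding(false)))
        {
            writer.Write(text);
        } // Disposing the writer closes each stream in turn, flushing as it goes.

        return output.ToArray();
    }

    // The reverse chain: MemoryStream -> CryptoStream -> GZipStream -> StreamReader.
    public static string Unprotect(byte[] data, byte[] key, byte[] iv)
    {
        using (var aes = Aes.Create())
        using (var decryptor = aes.CreateDecryptor(key, iv))
        using (var crypto = new CryptoStream(new MemoryStream(data), decryptor, CryptoStreamMode.Read))
        using (var gzip = new GZipStream(crypto, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        using (var aes = Aes.Create())
        {
            // Aes.Create() generates a random key and IV; real code has to store or agree these somehow.
            byte[] data = Protect("hello hello hello", aes.Key, aes.IV);
            Console.WriteLine(Unprotect(data, aes.Key, aes.IV));
        }
    }
}
```

Swapping the MemoryStream in Protect for a FileStream is all it should take to send the same bytes to disk instead.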

TextReader and TextWriter Classes

TextWriters are able to “write a sequential series of characters”. In other words, they write text. A TextWriter will handle the conversion of your input data (any object) to a string – calling the ToString method, basically – and the translation into the specified character encoding.

TextReaders just provide the ability to read data from a source as characters, either a character at a time (Read), a line at a time (ReadLine), or everything that remains in one go (ReadToEnd). Again a key aspect of what they do is handling the encoding.
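The three reading styles can be tried out on any TextReader. In this sketch I use a StringReader as the source because it needs no file; ReadThreeWays is a made-up helper.

```csharp
using System;
using System.IO;

public static class TextReaderDemo
{
    // Reads one character, then the rest of that line, then everything left over.
    public static string[] ReadThreeWays(TextReader reader)
    {
        char first = (char)reader.Read();
        string restOfLine = reader.ReadLine();
        string remainder = reader.ReadToEnd();
        return new[] { first.ToString(), restOfLine, remainder };
    }

    public static void Main()
    {
        using (var reader = new StringReader("alpha\nbeta\ngamma"))
        {
            foreach (string part in ReadThreeWays(reader))
            {
                Console.WriteLine("[" + part + "]");
            }
        }
    }
}
```

For the input above the three parts are “a”, “lpha” and the remaining two lines. Each call picks up where the previous one left off, just as with a Stream’s current position.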

StreamReader and StreamWriter

Reads or writes text to/from a Stream. So to write text to a file, use a StreamWriter and pass in a FileStream to the constructor.
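A minimal sketch of exactly that, followed by reading the text back with a StreamReader. The class name, method name and use of a temporary file are my own choices for illustration.

```csharp
using System;
using System.IO;
using System.Text;

public static class FileTextRoundTrip
{
    // Writes one line of text to a file via a FileStream, then reads it back.
    public static string WriteThenRead(string path, string text)
    {
        using (var file = new FileStream(path, FileMode.Create, FileAccess.Write))
        using (var writer = new StreamWriter(file, Encoding.UTF8))
        {
            writer.WriteLine(text);
        } // Disposing the writer flushes it and closes the FileStream.

        using (var file = new FileStream(path, FileMode.Open, FileAccess.Read))
        using (var reader = new StreamReader(file, Encoding.UTF8))
        {
            return reader.ReadLine();
        }
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            Console.WriteLine(WriteThenRead(path, "Hello, file"));
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

StreamWriter and StreamReader also have constructors that take a file path directly, which saves a line when you do not need control over how the file is opened.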

StringReader and StringWriter

Reads or writes text to/from a String.

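The StringWriter does this by containing a StringBuilder, which can do pretty much the same thing on its own without needing to be wrapped in a StringWriter. However, by using the StringWriter you get interchangeability with other TextWriter implementations, so code written against TextWriter can write to a string, a file or the console without changes. (No byte encoding actually happens here, since the result is a .NET string; the StringWriter’s Encoding property simply reports UTF-16.)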
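Interchangeability is easiest to see with a method that only knows about TextWriter. In this sketch, WriteReport and ReportToString are hypothetical names; the same WriteReport call works with a StringWriter and with Console.Out, which is also a TextWriter.

```csharp
using System;
using System.IO;

public static class ReportWriter
{
    // Writes to whatever TextWriter it is given.
    public static void WriteReport(TextWriter output, int[] values)
    {
        foreach (int value in values)
        {
            output.Write(value + ";");
        }
    }

    // Captures the report in a string by handing WriteReport a StringWriter.
    public static string ReportToString(int[] values)
    {
        using (var writer = new StringWriter())
        {
            WriteReport(writer, values);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        var values = new[] { 1, 2, 3 };

        Console.WriteLine(ReportToString(values)); // 1;2;3;

        WriteReport(Console.Out, values);
        Console.WriteLine();
    }
}
```

This is particularly handy in tests: production code writes to a file or the console, while the test passes in a StringWriter and inspects the result.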

TextWriter.Null

A handy place to send output to if you want to throw it away. A bit like piping a Unix command to “/dev/null”, or a Windows command to “nul”.
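For example, suppose a method insists on logging its progress to a TextWriter and you do not care about the log. The SumWithLog method here is invented for illustration:

```csharp
using System;
using System.IO;

public static class QuietMode
{
    // Adds up the values, narrating progress to the supplied log.
    public static int SumWithLog(int[] values, TextWriter log)
    {
        int total = 0;
        foreach (int value in values)
        {
            total += value;
            log.WriteLine("running total: " + total);
        }
        return total;
    }

    public static void Main()
    {
        // The log lines are discarded; only the result is printed.
        int total = SumWithLog(new[] { 1, 2, 3 }, TextWriter.Null);
        Console.WriteLine(total); // 6
    }
}
```

This saves littering the method with null checks around every log call.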

BinaryReader and BinaryWriter

These classes read or write any primitive C# type to a Stream in a binary encoding. So for example Write(int) will write 4 bytes to the underlying stream.
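A quick sketch to check the byte counts: an int, a bool and a double written with a BinaryWriter come to 4 + 1 + 8 = 13 bytes. The class and method names are made up for the example.

```csharp
using System;
using System.IO;

public static class BinaryRoundTrip
{
    // Writes three primitives in binary form and returns the raw bytes.
    public static byte[] Serialize(int number, bool flag, double value)
    {
        var stream = new MemoryStream();
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write(number); // 4 bytes
            writer.Write(flag);   // 1 byte
            writer.Write(value);  // 8 bytes
        }
        return stream.ToArray();
    }

    // Reads back the leading int; values must be read in the order they were written.
    public static int ReadFirstInt(byte[] data)
    {
        using (var reader = new BinaryReader(new MemoryStream(data)))
        {
            return reader.ReadInt32();
        }
    }

    public static void Main()
    {
        byte[] data = Serialize(1, true, 2.5);
        Console.WriteLine(data.Length);                     // 13
        Console.WriteLine(BitConverter.ToString(data, 0, 4)); // 01-00-00-00
        Console.WriteLine(ReadFirstInt(data));              // 1
    }
}
```

Note that nothing in the bytes says what type each value was, so the reader has to know the layout in advance. The int comes out least significant byte first, because BinaryWriter writes little-endian.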

XmlReader and XmlWriter

These classes read or write XML formatted data. XmlWriter is probably the more commonly used, and includes the ability to control how the XML is output. The usage is a bit unintuitive, so to save you time here’s how to create one (note that Create returns an instance of an internal concrete subclass chosen to match your settings; the older XmlTextWriter class still exists, but Microsoft recommends using Create rather than constructing one manually):

var xmlWriter = XmlWriter.Create(target, new XmlWriterSettings() {Indent = false});

Note that an XmlWriter can write to almost anything mentioned in this post! In particular you can write to a file, or to a TextWriter (allowing you to control the character encoding), or to a Stream (allowing you to control almost anything, in principle – you can encrypt and gzip the data as it goes). This principle is not uncommon in the .NET BCL – perhaps indeed it’s the root cause of the confusion between Streams, Writers, etc. But provided you bear in mind what the different types of class are designed for, you will hopefully be able to work out which is the appropriate level of abstraction to use in your code when you need to.
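To illustrate the choice of target, this sketch writes the same tiny document once to a TextWriter (a StringWriter) and once to a Stream (a MemoryStream). The class, the method names and the “greeting” element are all invented for the example, and I omit the XML declaration to keep the output short.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class XmlTargets
{
    // The writing code is identical whatever the XmlWriter is attached to.
    private static void WriteGreeting(XmlWriter xml)
    {
        xml.WriteStartElement("greeting");
        xml.WriteAttributeString("lang", "en");
        xml.WriteString("hello");
        xml.WriteEndElement();
    }

    // Target 1: a TextWriter, giving us characters.
    public static string ToText()
    {
        var settings = new XmlWriterSettings { Indent = false, OmitXmlDeclaration = true };
        using (var text = new StringWriter())
        {
            using (var xml = XmlWriter.Create(text, settings))
            {
                WriteGreeting(xml);
            } // Disposing the XmlWriter flushes it into the StringWriter.
            return text.ToString();
        }
    }

    // Target 2: a Stream, giving us bytes in the encoding named in the settings.
    public static byte[] ToBytes()
    {
        var settings = new XmlWriterSettings
        {
            Indent = false,
            OmitXmlDeclaration = true,
            Encoding = new UTF8Encoding(false)
        };
        var stream = new MemoryStream();
        using (var xml = XmlWriter.Create(stream, settings))
        {
            WriteGreeting(xml);
        }
        return stream.ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(ToText()); // <greeting lang="en">hello</greeting>
        Console.WriteLine(ToBytes().Length + " bytes");
    }
}
```

When the target is a Stream, the XmlWriter takes on the encoding job itself; when it is a TextWriter, that job belongs to the TextWriter.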


Categories: Technical
