Haskell – 96 – input and output – 5

Continuing from here and copying from here; scroll down to “Just like we have hGetContents that works like getContents”.

Just like we have hGetContents that works like getContents but for a specific file, there’s also hGetLine, hPutStr, hPutStrLn, hGetChar, etc. They work just like their counterparts without the h, only they take a handle as a parameter and operate on that specific file instead of operating on standard input or standard output. Example: putStrLn is a function that takes a string and returns an I/O action that will print out that string to the terminal and a newline after it. hPutStrLn takes a handle and a string and returns an I/O action that will write that string to the file associated with the handle and then put a newline after it. In the same vein, hGetLine takes a handle and returns an I/O action that reads a line from its file.
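As a minimal sketch of the handle-based variants described above, the following writes two lines through a handle and reads the first one back. The file name example.txt and the helper roundTrip are hypothetical names chosen only for this illustration.

```haskell
import System.IO

-- Write two lines to a file through a handle, then read the first one back.
-- roundTrip is a hypothetical helper, not something from System.IO.
roundTrip :: FilePath -> IO String
roundTrip path = do
  -- withFile opens the file, runs the action, and closes the handle afterwards.
  withFile path WriteMode $ \h -> do
    hPutStrLn h "first line"   -- like putStrLn, but writes to h
    hPutStr h "second line"    -- like putStr: no trailing newline
  withFile path ReadMode $ \h ->
    hGetLine h                 -- like getLine, but reads from h

main :: IO ()
main = do
  line <- roundTrip "example.txt"  -- hypothetical file name
  putStrLn line
```

Each h-function mirrors its counterpart exactly, with the handle as an extra first argument.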

Loading files and then treating their contents as strings is so common that we have these three nice little functions to make our work even easier.

readFile has a type signature of readFile :: FilePath -> IO String. Remember, FilePath is just a fancy name for String. readFile takes a path to a file and returns an I/O action that will read that file (lazily, of course) and bind its contents to something as a string. It’s usually more handy than doing openFile and binding it to a handle and then doing hGetContents. Here’s how we could have written our previous example with readFile:

import System.IO

main = do
  contents <- readFile "girlfriend.txt"
  putStr contents

Because we don’t get a handle with which to identify our file, we can’t close it manually, so Haskell does that for us when we use readFile.

writeFile has a type of writeFile :: FilePath -> String -> IO (). It takes a path to a file and a string to write to that file and returns an I/O action that will do the writing. If such a file already exists, it will be stomped down to zero length before being written on. Here’s how to turn girlfriend.txt into a CAPSLOCKED version and write it to girlfriendcaps.txt:


Hey! Hey! You! You!
I don't like your girlfriend!
No way! No way!
I think you need a new one!


import System.IO
import Data.Char

main = do
  contents <- readFile "girlfriend.txt"
  writeFile "girlfriendcaps.txt" (map toUpper contents)

appendFile has a type signature that’s just like writeFile, only appendFile doesn’t truncate the file to zero length if it already exists but it appends stuff to it.

Let’s say we have a file todo.txt that has one task per line that we have to do. Now let’s make a program that takes a line from the standard input and adds that to our to-do list.


import System.IO

main = do
  todoItem <- getLine
  appendFile "todo.txt" (todoItem ++ "\n")

We needed to add the "\n" to the end of each line because getLine doesn’t give us a newline character at the end.

Ooh, one more thing. We talked about how doing contents <- hGetContents handle doesn’t cause the whole file to be read at once and stored in-memory. It’s I/O lazy, so doing this:

import System.IO

main = do
  withFile "something.txt" ReadMode (\handle -> do
    contents <- hGetContents handle
    putStr contents)

is actually like connecting a pipe from the file to the output. Just like you can think of lists as streams, you can also think of files as streams. This will read one line at a time and print it out to the terminal as it goes along. So you may be asking, how wide is this pipe then? How often will the disk be accessed? Well, for text files, the default buffering is usually line-buffering. That means that the smallest part of the file to be read at once is one line. That’s why in this case it actually reads a line, prints it to the output, reads the next line, prints it, etc. For binary files, the default buffering is usually block-buffering. That means that it will read the file chunk by chunk. The chunk size is some size that your operating system thinks is cool.

You can control how exactly buffering is done by using the hSetBuffering function. It takes a handle and a BufferMode and returns an I/O action that sets the buffering. BufferMode is a simple enumeration data type and the possible values it can hold are: NoBuffering, LineBuffering or BlockBuffering (Maybe Int). The Maybe Int is for how big the chunk should be, in bytes. If it’s Nothing, then the operating system determines the chunk size. NoBuffering means that it will be read one character at a time. NoBuffering usually sucks as a buffering mode because it has to access the disk so much.

Here’s our previous piece of code, only it doesn’t read it line by line but reads the whole file in chunks of 2048 bytes.

import System.IO

main = do
  withFile "something.txt" ReadMode (\handle -> do
    hSetBuffering handle $ BlockBuffering (Just 2048)
    contents <- hGetContents handle
    putStr contents)

Reading files in bigger chunks can help if we want to minimize disk access or when our file is actually a slow network resource.

We can also use hFlush, which is a function that takes a handle and returns an I/O action that will flush the buffer of the file associated with the handle. When we’re doing line-buffering, the buffer is flushed after every line. When we’re doing block-buffering, it’s after we’ve read a chunk. It’s also flushed after closing a handle. That means that when we’ve reached a newline character, the reading (or writing) mechanism reports all the data so far. But we can use hFlush to force that reporting of data that has been read so far. After flushing, the data is available to other programs that are running at the same time.

Think of reading a block-buffered file like this: your toilet bowl is set to flush itself after it has one gallon of water inside it. So you start pouring in water and once the gallon mark is reached, that water is automatically flushed and the data in the water that you’ve poured in so far is read. But you can flush the toilet manually too by pressing the button on the toilet. This makes the toilet flush and all the water (data) inside the toilet is read. In case you haven’t noticed, flushing the toilet manually is a metaphor for hFlush. This is not a very great analogy by programming analogy standards, but I wanted a real world object that can be flushed for the punchline. HumorisMiranico? 😯
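A place where hFlush commonly shows up in practice is on the writing side: printing a prompt without a trailing newline and making sure it appears before the program waits for input. The sketch below assumes stdout is buffered (it is typically line-buffered when attached to a terminal); the helper greet is a hypothetical name used only here.

```haskell
import System.IO

-- Build the greeting separately so the logic is easy to check on its own.
-- greet is a hypothetical helper for this sketch.
greet :: String -> String
greet name = "Hello, " ++ name ++ "!"

main :: IO ()
main = do
  putStr "What's your name? "  -- no newline, so a buffered stdout may hold this back
  hFlush stdout                -- force the prompt out before we block on input
  name <- getLine
  putStrLn (greet name)
```

Without the hFlush call, the prompt might only appear after the user has already typed their answer, depending on how stdout is buffered.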

We already made a program to add a new item to our to-do list in todo.txt, now let’s make a program to remove an item. I’ll just paste the code and then we’ll go over the program together so you see that it’s really easy. We’ll be using a few new functions from System.Directory and one new function from System.IO, but they’ll all be explained.

Anyway, here’s the program for removing an item from todo.txt:


import System.IO
import System.Directory
import Data.List

main = do
  handle <- openFile "todo.txt" ReadMode
  (tempName, tempHandle) <- openTempFile "." "temp"
  contents <- hGetContents handle
  let todoTasks = lines contents
      numberedTasks = zipWith (\n line -> show n ++ " - " ++ line) [0..] todoTasks
  putStrLn "These are your TO-DO items:"
  putStr $ unlines numberedTasks
  putStrLn "Which one do you want to delete?"
  numberString <- getLine
  let number = read numberString
      newTodoItems = delete (todoTasks !! number) todoTasks
  hPutStr tempHandle $ unlines newTodoItems
  hClose handle
  hClose tempHandle
  removeFile "todo.txt"
  renameFile tempName "todo.txt"

At first, we just open todo.txt in read mode and bind its handle to handle.

Next up, we use a function that we haven’t met before which is from System.IO — openTempFile. Its name is pretty self-explanatory. It takes a path to a temporary directory and a template name for a file and opens a temporary file. We used "." for the temporary directory, because . denotes the current directory on just about any OS. We used "temp" as the template name for the temporary file, which means that the temporary file will be named temp plus some random characters. It returns an I/O action that makes the temporary file and the result in that I/O action is a pair of values: the name of the temporary file and a handle. We could just open a normal file called todo2.txt or something like that but it’s better practice to use openTempFile so you know you’re probably not overwriting anything.

The reason we didn’t use getCurrentDirectory to get the current directory and then pass it to openTempFile, but instead just passed "." to openTempFile, is that . refers to the current directory on Unix-like systems and on Windows.
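To see openTempFile on its own, here is a small sketch that creates a temporary file in the current directory, writes to it, and returns the generated name. The helper makeTemp is a hypothetical name, and the exact characters appended to "temp" vary from run to run.

```haskell
import System.IO
import System.Directory (removeFile)

-- Create a temporary file in the current directory, write to it, and
-- return the name that was generated for it.
-- makeTemp is a hypothetical helper for this sketch.
makeTemp :: String -> IO FilePath
makeTemp body = do
  (tempName, tempHandle) <- openTempFile "." "temp"
  hPutStr tempHandle body
  hClose tempHandle      -- close it so the contents reach the disk
  return tempName

main :: IO ()
main = do
  name <- makeTemp "scratch data\n"
  putStrLn ("Created " ++ name)
  contents <- readFile name
  putStr contents        -- forces the whole file to be read
  removeFile name        -- clean up after ourselves
```

Note that the file is not deleted automatically; the caller decides whether to remove it or rename it.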

Next up, we bind the contents of todo.txt to contents. Then we split that string into a list of strings, each string one line. So todoTasks is now something like ["Iron the dishes", "Dust the dog", "Shave the yak"]. We zip the numbers from 0 onwards and that list with a function that takes a number, like 3, and a string, like "hey", and returns "3 - hey", so numberedTasks is ["0 - Iron the dishes", "1 - Dust the dog" …. We join that list of strings into a single newline delimited string with unlines and print that string out to the terminal. Note that instead of doing that, we could have also done mapM putStrLn numberedTasks.
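The numbering step can be tried in isolation. The sketch below pulls it into a pure function; numberTasks is a hypothetical name, and the index type is fixed to Int purely for this illustration.

```haskell
-- Pair each task with its index, starting at 0, in the "n - task" format.
-- numberTasks is a hypothetical helper for this sketch.
numberTasks :: [String] -> [String]
numberTasks = zipWith (\n line -> show n ++ " - " ++ line) [(0 :: Int) ..]

main :: IO ()
main = do
  let todoTasks = lines "Iron the dishes\nDust the dog\nShave the yak\n"
  -- Join the numbered lines into one string and print it in one go.
  putStr $ unlines (numberTasks todoTasks)
  -- Same output, one putStrLn per line; mapM_ discards the list of results.
  mapM_ putStrLn (numberTasks todoTasks)
```

Because zipWith stops at the end of the shorter list, the infinite list of indices causes no trouble.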

We ask the user which one they want to delete and wait for them to enter a number. Let’s say they want to delete number 1, which is Dust the dog, so they punch in 1. numberString is now "1" and because we want a number, not a string, we run read on that to get 1 and bind that to number.

Remember the delete and !! functions from Data.List. !! returns the element of a list at a given index, and delete removes the first occurrence of an element from a list and returns a new list without that occurrence. (todoTasks !! number) (number is now 1) returns "Dust the dog". We bind todoTasks without the first occurrence of "Dust the dog" to newTodoItems and then join that into a single string with unlines before writing it to the temporary file that we opened. The old file is still unchanged and the temporary file contains all the lines that the old one does, except the one we deleted.
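The deletion step can also be checked on its own. This sketch wraps it in a pure function (removeTask is a hypothetical name) and shows one consequence of deleting by value: when the list contains duplicates, the first match is removed, which may not be the entry at the chosen index.

```haskell
import Data.List (delete)

-- Remove the task at the given index the same way the program does:
-- look the task up with !!, then delete its first occurrence.
-- removeTask is a hypothetical helper; !! fails on an out-of-range index.
removeTask :: Int -> [String] -> [String]
removeTask number todoTasks = delete (todoTasks !! number) todoTasks

main :: IO ()
main = do
  let todoTasks = ["Iron the dishes", "Dust the dog", "Shave the yak"]
  print (removeTask 1 todoTasks)        -- ["Iron the dishes","Shave the yak"]
  -- Index 2 holds "a", but delete removes the first "a", at index 0.
  print (removeTask 2 ["a", "b", "a"])  -- ["b","a"]
```

For a to-do list the difference rarely matters, since the remaining lines are the same apart from their order.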

After that we close both the original and the temporary files and then we remove the original one with removeFile, which, as you can see, takes a path to a file and deletes it. After deleting the old todo.txt, we use renameFile to rename the temporary file to todo.txt. Be careful, removeFile and renameFile (which are both in System.Directory by the way) take file paths as their parameters, not handles.

And that’s that! We could have done this in even fewer lines, but we were very careful not to overwrite any existing files and politely asked the operating system to tell us where we can put our temporary file. Let’s give this a go!

