File Sniffer

Learning Exercise

Introduction

Binaries

Elixir provides an elegant syntax for working with binary data: the <<>> special form, which we have already seen used with bitstrings.

The binary type is a specialization of the bitstring type. Whereas a bitstring can be of any length (any number of bits), a binary is a bitstring whose number of bits is evenly divisible by 8. That is, when working with binaries, we usually think in terms of bytes (8 bits). A byte can represent integers from 0 to 255. It is common to write byte values in hexadecimal, 0x00 to 0xFF.
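As a small illustration of the distinction, the sketch below checks a three-byte value and a three-bit value with the Kernel functions is_bitstring/1, is_binary/1, bit_size/1, and byte_size/1. The variable names are chosen only for this example.

```elixir
# Three bytes (24 bits): a bitstring that is also a binary.
three_bytes = <<1, 2, 3>>
IO.inspect(is_bitstring(three_bytes)) # true
IO.inspect(is_binary(three_bytes))    # true
IO.inspect(bit_size(three_bytes))     # 24
IO.inspect(byte_size(three_bytes))    # 3

# Three bits: still a bitstring, but not a binary,
# because 3 is not evenly divisible by 8.
three_bits = <<5::size(3)>>
IO.inspect(is_bitstring(three_bits))  # true
IO.inspect(is_binary(three_bits))     # false
IO.inspect(bit_size(three_bits))      # 3
```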

Binary literals are defined using the bitstring special form <<>>. When defining a binary literal, we can use integer and string literals. Integers are encoded as a single byte by default (the ::integer type with a size of 8 bits), so values greater than 255 are truncated and only the lowest 8 bits are kept. String literals are inserted as their bytes. Any other value, such as a variable holding a binary, needs an explicit modifier like ::binary. We can concatenate binaries with the <>/2 operator.

<<255>> == <<0xFF>>
# Overflowing bits are truncated
<<256>> == <<0>>
<<2, 4, 6, 8, 10, 12, 14, 16>> == <<0x02, 0x04, 0x06, 0x08, 0x0A, 0x0C, 0x0E, 0x10>>

A null-byte is another name for <<0>>.
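To make the points about string literals and concatenation concrete, here is a short sketch. The byte values are the ASCII codes of the characters involved; the variable names are only for illustration.

```elixir
# A string literal inside <<>> contributes its bytes:
# "P" is 0x50 and "K" is 0x4B.
mixed = <<"PK", 3, 4>>
IO.inspect(mixed == <<0x50, 0x4B, 0x03, 0x04>>) # true

# <>/2 joins two binaries into one.
joined = <<0x25, 0x50>> <> <<0x44, 0x46>>
IO.inspect(joined == "%PDF") # true

# Appending a null-byte adds a single zero byte at the end.
terminated = "abc" <> <<0>>
IO.inspect(terminated == <<97, 98, 99, 0>>) # true
```

Strings in Elixir are themselves binaries, which is why a string and a list of byte values can compare equal.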

Pattern matching on binary data

Pattern matching also works on binaries: we can match on a portion of binary data much as we would on the head and tail of a list.

# Ignore the first 8 bytes, match and bind the remainder to `body`
<<_::binary-size(8), body::binary>> = <<1, 2, 3, 4, 5, 6, 7, 8, 9, 10>>
body == <<9, 10>>

As with other kinds of pattern matching, we can use binary patterns in function heads to select between multiple function clauses.
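A minimal sketch of clause selection by binary pattern follows. The module HeaderExample and its describe/1 function are hypothetical, and the two signatures (PDF and ZIP) are deliberately not taken from this exercise.

```elixir
defmodule HeaderExample do
  # Hypothetical module for illustration only.

  # PDF files begin with the ASCII characters "%PDF".
  def describe(<<0x25, 0x50, 0x44, 0x46, _rest::binary>>), do: :pdf

  # ZIP archives begin with "PK" followed by the bytes 3 and 4.
  def describe(<<0x50, 0x4B, 0x03, 0x04, _rest::binary>>), do: :zip

  # Fallback clause for any other binary, including an empty one.
  def describe(data) when is_binary(data), do: :unknown
end

IO.inspect(HeaderExample.describe("%PDF-1.7"))   # :pdf
IO.inspect(HeaderExample.describe("plain text")) # :unknown
```

Clauses are tried from top to bottom, so the catch-all clause has to come last.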

Instructions

You have been working on a project which allows users to upload files to the server to be shared with other users. You have been tasked with writing a function to verify that an upload matches its media type. You do some research and discover that the first few bytes of a file are generally unique to that file type, giving it a sort of signature.

Use the following table for reference:

| File type | Common extension | Media type | Binary 'signature' |
| --- | --- | --- | --- |
| ELF | "exe" | "application/octet-stream" | 0x7F, 0x45, 0x4C, 0x46 |
| BMP | "bmp" | "image/bmp" | 0x42, 0x4D |
| PNG | "png" | "image/png" | 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A |
| JPG | "jpg" | "image/jpg" | 0xFF, 0xD8, 0xFF |
| GIF | "gif" | "image/gif" | 0x47, 0x49, 0x46 |

1. Given an extension, return the expected media type

Implement the type_from_extension/1 function. It should take a file extension (string) and return the media type (string), or nil if the extension does not match any of the expected ones.

FileSniffer.type_from_extension("exe")
# => "application/octet-stream"

FileSniffer.type_from_extension("txt")
# => nil
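One possible shape for such a lookup is a set of function clauses that match on string literals, with a catch-all clause returning nil. The sketch below uses a hypothetical module, ExtensionExample, and extensions that are not part of this exercise.

```elixir
defmodule ExtensionExample do
  # Hypothetical module; "pdf" and "zip" are illustrative only.
  def type_for("pdf"), do: "application/pdf"
  def type_for("zip"), do: "application/zip"

  # Catch-all clause: any other extension yields nil.
  def type_for(_other), do: nil
end

IO.inspect(ExtensionExample.type_for("pdf")) # "application/pdf"
IO.inspect(ExtensionExample.type_for("txt")) # nil
```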

2. Given a binary file, return the expected media type

Implement the type_from_binary/1 function. It should take a file (binary) and return the media type (string), or nil if the file's signature does not match any of the expected ones.

file = File.read!("application.exe")
FileSniffer.type_from_binary(file)
# => "application/octet-stream"

file = File.read!("example.txt")
FileSniffer.type_from_binary(file)
# => nil

Don't worry about reading the file as a binary. Assume that has been done for you and is provided by the tests as an argument.

3. Given an extension and a binary file, verify that the file matches the expected type

Implement the verify/2 function. It should take a file (binary) and extension (string) and return an :ok or :error tuple.

file = File.read!("application.exe")

FileSniffer.verify(file, "exe")
# => {:ok, "application/octet-stream"}

FileSniffer.verify(file, "png")
# => {:error, "Warning, file format and file extension do not match."}
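The general idea of comparing the two lookups and returning a tagged tuple can be sketched as follows. Everything here is hypothetical: the module VerifyExample, its private helpers, the PDF signature, and the error message are stand-ins rather than the exercise's expected solution.

```elixir
defmodule VerifyExample do
  # Hypothetical helper: media type expected for an extension.
  defp from_extension("pdf"), do: "application/pdf"
  defp from_extension(_other), do: nil

  # Hypothetical helper: media type detected from the leading bytes.
  defp from_binary(<<"%PDF", _rest::binary>>), do: "application/pdf"
  defp from_binary(_other), do: nil

  # Returns {:ok, type} only when both lookups agree on a known type.
  def check(data, extension) do
    expected = from_extension(extension)
    actual = from_binary(data)

    if expected != nil and expected == actual do
      {:ok, actual}
    else
      {:error, "type mismatch"}
    end
  end
end

IO.inspect(VerifyExample.check("%PDF-1.4", "pdf")) # {:ok, "application/pdf"}
IO.inspect(VerifyExample.check("hello", "pdf"))    # {:error, "type mismatch"}
```

The explicit nil check matters: without it, an unknown extension paired with an unrecognised file would compare nil to nil and be reported as a match.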