Reading files in C++ can sometimes be a complex process. I wrote this article about four different ways to read an entire file into a string. I outline the pros and cons of each method, and I benchmarked them to figure out which one is the fastest.

Problem statement: Given a file path (std::string), read the entire file into fileData (std::string).

The four methods I used are C’s fread, C++’s read, rdbuf, and istreambuf_iterator.

Unlike a C string, a std::string stores its length explicitly rather than relying on a terminating null character, so it can hold binary data, including embedded null bytes.
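To make that point concrete, here is a minimal sketch. The helper names makeBinaryString and cStyleLength are made up for this illustration; they only contrast the length that std::string stores with what a C-style length function would report.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: builds a std::string that holds raw bytes, including '\0'.
std::string makeBinaryString()
{
    const char raw[] = {'a', '\0', 'b', '\0', 'c'};
    // The (pointer, length) constructor copies exactly sizeof(raw) bytes
    // and does not stop at the first '\0'.
    return std::string(raw, sizeof(raw));
}

// Hypothetical helper: the length a C-style function sees, which stops at the first '\0'.
std::size_t cStyleLength(const std::string& s)
{
    return std::strlen(s.c_str());
}
```

Here makeBinaryString().size() reports all five bytes, while cStyleLength stops after the first one.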

Basic Algorithm Outline

The basic outline for each method is:

  • Open the file
  • Check if the file is opened
  • Determine file size
  • Resize string to file size
  • Read the data
  • Close file

Open the file

We open the file for reading in binary mode. Why binary? Because I don’t know what type of data you want to open, and binary mode prevents any interpretation of the data, such as newline translation on some platforms. If you have a text file with lots of newlines, you can read it differently, for example line by line with getline.
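For the text-file case mentioned above, a line-by-line read could look roughly like the following. This is only a sketch: the function name readLines is hypothetical, and it opens the file in the default text mode rather than binary.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: reads a text file line by line with std::getline.
// Returns an empty vector if the file cannot be opened.
std::vector<std::string> readLines(const std::string& filePath)
{
    std::vector<std::string> lines;
    std::ifstream file(filePath); // default text mode, no std::ios::binary

    std::string line;
    while (std::getline(file, line)) // stops at end of file or on error
    {
        lines.push_back(line); // the trailing '\n' is not stored
    }
    return lines;
}
```

This trades the single bulk read for one string per line, which is convenient for text but is not what the rest of this article measures.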

Check if file is opened

Checking that the file opened is important because it lets us handle the failure, for example when the file is missing or locked by another process. Otherwise, any later operation on the file is unsafe: passing a null FILE* to the C functions is undefined behavior and could crash the program, while a std::ifstream that failed to open will simply fail every read.
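One way to make the open check hard to forget is to wrap it in a small function that returns an empty std::optional on failure. The name openBinary is an assumption for this sketch, not something from the methods below.

```cpp
#include <fstream>
#include <optional>
#include <string>

// Hypothetical helper: opens a file for binary reading.
// Returns std::nullopt if the file could not be opened (missing, locked, ...).
std::optional<std::ifstream> openBinary(const std::string& filePath)
{
    std::ifstream file(filePath, std::ios::binary);
    if (!file.is_open())
    {
        return std::nullopt;
    }
    return file; // the stream is moved into the optional
}
```

The caller then has to test the optional before touching the stream, for example with if (auto file = openBinary(path)) { ... }.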

Determine file size

Determining the file size is important for allocating space for the string.

Resize string to file size

Resizing the string to the file size up front (or at least reserving space for it) is important because re-allocating memory is generally a slow process. If we open a large file, we want the string to be able to take in all of its contents without growing repeatedly.
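To get a feel for how often a string re-allocates when it is not sized up front, the sketch below appends one character at a time and counts capacity changes. The function countReallocations is hypothetical and uses reserve rather than resize, purely to make the growth visible.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: appends n characters one at a time and counts
// how many times the string's capacity changes along the way.
std::size_t countReallocations(std::size_t n, bool reserveFirst)
{
    std::string s;
    if (reserveFirst)
    {
        s.reserve(n); // allocate once, up front
    }

    std::size_t reallocations = 0;
    std::size_t lastCapacity = s.capacity();
    for (std::size_t i = 0; i < n; ++i)
    {
        s.push_back('x');
        if (s.capacity() != lastCapacity) // the buffer grew
        {
            ++reallocations;
            lastCapacity = s.capacity();
        }
    }
    return reallocations;
}
```

With reserveFirst set to true the count stays at zero. Without it the exact number depends on the standard library's growth policy, but it is greater than zero for any sizable n.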

Read the data

Finally, read the data into the string. Ideally, we should only need one read with the file size.

Close the file

We close the file so other processes can use it again if needed. With the C++ methods, the stream closes the file automatically when it goes out of scope, but a C-style FILE* must be closed explicitly with fclose. In general, it is good practice to close the file as soon as you’re done with it.
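If you want scope-based closing for the C-style handle as well, one common approach is to hold the FILE* in a std::unique_ptr with a custom deleter. This is a sketch only; FileCloser, UniqueFile, and openFile are names I made up for it.

```cpp
#include <cstdio>
#include <memory>
#include <string>

// Hypothetical deleter: closes the FILE* when the owning unique_ptr is destroyed.
// unique_ptr never invokes the deleter on a null pointer, so no null check is needed.
struct FileCloser
{
    void operator()(std::FILE* f) const noexcept { std::fclose(f); }
};

using UniqueFile = std::unique_ptr<std::FILE, FileCloser>;

// Hypothetical helper: wraps fopen. The result compares equal to nullptr if the open failed.
UniqueFile openFile(const std::string& filePath, const char* mode)
{
    return UniqueFile(std::fopen(filePath.c_str(), mode));
}
```

With this wrapper, an early return on an error path no longer leaks the handle, and the raw pointer for fread or ftell is available through get().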

Benchmarks

I ran some back-of-the-envelope benchmarks to understand the performance of each method. For small files that are opened occasionally, the speed of the method matters less. However, for larger files or when reading many files, speed matters.

I tested with 3 runs on a 500 MB file and 5 runs on a 25 MB file. I normalized the results to the fastest method, C’s fread, at 1.0x.

Method               Relative performance (time; lower is faster)
C’s fread            1.0x
read                 ~2x
rdbuf                ~10x
istreambuf_iterator  ~20x

fread was fastest, C++’s read was about twice as slow, rdbuf was about 10x slower, and istreambuf_iterator about 20x slower.

rdbuf was about 10x slower because the string has to get copied out of the stringstream, which slowed things down.

I suggest you time your own code. Measure, then optimize!
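A simple way to time a piece of code is std::chrono::steady_clock. The sketch below is not the harness I used for the numbers above; measureMilliseconds is a hypothetical helper that times a single call of whatever callable you pass in.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper: runs `work` once and returns the elapsed wall-clock
// time in milliseconds, measured with a monotonic clock.
template <typename F>
double measureMilliseconds(F&& work)
{
    const auto start = std::chrono::steady_clock::now();
    std::forward<F>(work)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

You could call it as measureMilliseconds([&] { doRead(); }) and repeat the measurement several times, since the operating system's file cache can make the first run look very different from later ones.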

C’s fread

fread comes from the C standard library, so if you want to stick to C++ streams you may want to skip this method. However, if you need speed, this was the fastest in my tests.

⚠️ Check the ftell result before casting it to an unsigned type. ftell returns -1 if something goes wrong, and casting -1 to an unsigned type gives the maximum unsigned value, which is not something you want to allocate!

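#include <cstdio>
#include <print>   // std::println (C++23)
#include <string>

// filePath is assumed to be a std::string defined elsewhere
void doRead()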
{
    std::string fileData;

    // Open the file
    FILE* f = fopen(filePath.c_str(), "rb");

    // Check if file opened
    if (!f)
    {
        // Handle error
        std::println("Unable to open file");
        return;
    }

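    // Determine file size
    if (fseek(f, 0, SEEK_END) != 0)
    {
        // handle error
        std::println("fseek failed");
        fclose(f);
        return;
    }

    long ftellResult = ftell(f);

    if (ftellResult < 0)
    {
        // handle error
        std::println("ftell failed");
        fclose(f); // close the handle on the error path too
        return;
    }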

    rewind(f);
    
    size_t fileSize = static_cast<size_t>(ftellResult);

    // Resize string to file size
    fileData.resize(fileSize);

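    // Read data and check that the whole file was read
    size_t bytesRead = fread(&fileData[0], sizeof(char), fileSize, f);

    if (bytesRead != fileSize)
    {
        // handle error
        std::println("fread read fewer bytes than expected");
    }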

    // Close file
    fclose(f);
}
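The function above can also be packaged so that it takes the path as a parameter and reports failure through its return value. The following is a sketch of one such variant; the name readFileWithFread and the choice of std::optional are my assumptions for illustration.

```cpp
#include <cstddef>
#include <cstdio>
#include <optional>
#include <string>
#include <utility>

// Hypothetical variant: returns the file contents, or std::nullopt on any failure.
std::optional<std::string> readFileWithFread(const std::string& filePath)
{
    std::FILE* f = std::fopen(filePath.c_str(), "rb");
    if (!f)
    {
        return std::nullopt;
    }

    std::optional<std::string> result;
    if (std::fseek(f, 0, SEEK_END) == 0)
    {
        const long size = std::ftell(f);
        if (size >= 0) // check before casting to an unsigned type
        {
            std::rewind(f);
            std::string data(static_cast<std::size_t>(size), '\0');
            const std::size_t got = std::fread(&data[0], 1, data.size(), f);
            if (got == data.size()) // accept only a complete read
            {
                result = std::move(data);
            }
        }
    }

    std::fclose(f); // single exit point, so the handle is always closed
    return result;
}
```

Funnelling every path through one fclose avoids leaking the handle on the error branches.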

C++’s read

std::ifstream::read is the C++ stream counterpart. From my understanding, it is a wrapper over the C file I/O operations that provides more error handling. For some reason, though, it cut the performance in half in my tests!

ℹ️ A nifty trick is to open the file with the std::ios::ate flag, which positions the stream at the end of the file, so tellg gives the file size right away.

⚠️ As with ftell, check the tellg result before casting it to an unsigned type. tellg returns -1 if something goes wrong, and casting -1 to an unsigned type gives the maximum unsigned value, which is not something you want to allocate!

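#include <fstream>
#include <print>   // std::println (C++23)
#include <string>

// filePath is assumed to be a std::string defined elsewhere
void doRead()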
{
    std::string fileData;

    // Open file
    // seek to the end with std::ios::ate
    std::ifstream file(filePath, std::ios::in | std::ios::binary | std::ios::ate);

    // Check if file opened
    if (!file.is_open())
    {
        // handle error
        std::println("Unable to open file");
        return;
    }

    size_t fileSize;

    // Determine file size (method 1)
    auto tellgResult = file.tellg();

    if (tellgResult == -1)
    {
        // handle error
        std::println("tellg failed");
        return;
    }

    fileSize = static_cast<size_t>(tellgResult);

    file.seekg(0, std::ios_base::beg);
    
    // Resize string
    fileData.resize(fileSize);

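    // Read file into string and check that the read succeeded
    if (!file.read(fileData.data(), static_cast<std::streamsize>(fileSize)))
    {
        // handle error
        std::println("read failed");
    }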

    // Close file
    file.close();
}

ℹ️ We can alternatively determine file size with std::filesystem:

// Determine file size (method 2)
namespace fs = std::filesystem;
std::error_code errorCode;

//fs::path path = filePath;
auto file_sizeResult = fs::file_size(filePath, errorCode);
if (errorCode)
{
    // handle error
    std::println("file_size failed");
    return;
}
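Putting the std::filesystem size query together with read, a complete function might look like the sketch below. The name readFileWithFileSize and the std::optional return type are assumptions of mine. Note that the file could in principle change size between the size query and the read, which the gcount check catches.

```cpp
#include <cstddef>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <ios>
#include <optional>
#include <string>
#include <system_error>

// Hypothetical variant: sizes the string with std::filesystem::file_size,
// then fills it with a single read. Returns std::nullopt on any failure.
std::optional<std::string> readFileWithFileSize(const std::filesystem::path& filePath)
{
    std::error_code errorCode;
    const std::uintmax_t size = std::filesystem::file_size(filePath, errorCode);
    if (errorCode)
    {
        return std::nullopt; // e.g. the file does not exist
    }

    std::ifstream file(filePath, std::ios::binary);
    if (!file.is_open())
    {
        return std::nullopt;
    }

    std::string data(static_cast<std::size_t>(size), '\0');
    file.read(data.data(), static_cast<std::streamsize>(data.size()));

    // gcount reports how many bytes the last read actually delivered
    if (static_cast<std::size_t>(file.gcount()) != data.size())
    {
        return std::nullopt;
    }
    return data;
}
```

Because the size comes from the filesystem rather than from the stream, there is no need for std::ios::ate or a seek back to the beginning.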

Rdbuf method

Alternatively, we can use rdbuf to read the file into a string stream. However, it requires an extra copy into a string, significantly reducing performance 😢. This method is a good fit if you need a string stream anyway.

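#include <fstream>
#include <print>   // std::println (C++23)
#include <sstream>
#include <string>

// filePath is assumed to be a std::string defined elsewhere
void doRead()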
{
    // Open file
    std::ifstream in(filePath, std::ios::binary);
    std::ostringstream stringStream;

    // Check if file opened
    if (!in.is_open())
    {
        // Handle error
        std::println("Unable to open file");
        return;
    }

    // Read file
    stringStream << in.rdbuf();

    // Copy string stream to string
    std::string fileData = stringStream.str();

    // Close file
    in.close();
}

istreambuf_iterator method

This is by far the slowest method; however, it takes the least amount of code.

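#include <fstream>
#include <iterator>
#include <print>   // std::println (C++23)
#include <string>

// filePath is assumed to be a std::string defined elsewhere
void doRead()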
{
    // Open file
    std::ifstream ifs = std::ifstream(filePath, std::ios::binary);

    // Check if file opened
    if (!ifs.is_open())
    {
        // Handle error
        std::println("Unable to open file");
        return;
    }

    // Read file into string
    std::string fileData(std::istreambuf_iterator<char>{ifs}, {});

    // Close file
    ifs.close();
}

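If you want to throw caution to the wind and skip both the open check and the manual close, you can write the following one-liner. It passes the temporary stream’s rdbuf() to the iterator, because a temporary std::ifstream cannot bind to the iterator constructor that takes a non-const stream reference (some compilers accept that form only as an extension):

std::string fileData(std::istreambuf_iterator<char>{std::ifstream(filePath, std::ios::binary).rdbuf()}, {});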

Summary and Recommendations

We discussed four different ways to read an entire file into a string, along with the pros and cons of each and their performance.

I recommend that you benchmark (i.e., measure) your file I/O methods and determine which is best for your situation.

From my data, the fastest method was C’s fread; however, it is a C library function rather than a C++ stream. C++’s read was the fastest of the C++ stream methods. I recommend avoiding any method that requires additional copies or cannot allocate the string up front, as these are costly operations.