C++’s substr function returns a newly constructed string: a copy of a portion of the original string, selected by the parameters passed to substr. It is also a common source of confusion. This post discusses how substr works and what the common pitfalls are.

What does substr do?

substr copies a substring of the original string. A substring is a contiguous sequence of characters within a string. For example, nate is a substring of donate. A substring can also contain the entire original string.

Substring Parameters

string::substr is part of the C++ standard library and is included by #include <string>.

The substring function is defined as string substr(size_t position = 0, size_t length = std::string::npos) const.

Let’s take a closer look at the two parameters, position and length, which are a source of confusion because they are different kinds of numbers. Position is a 0-based index: the characters of the string sit at positions 0 to string::size()-1 (a position equal to string::size() is also accepted and yields an empty string). Length is a count of characters, not an index: it ranges from 0, which gives an empty string, up to string::size(). Many developers think the two arguments are a start and an end index, but this is not the case.
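To make the difference between a (position, length) pair and a (start, end) pair concrete, here is a minimal sketch. The helper name slice_by_indices is hypothetical and not part of the standard library; it assumes a half-open range [start, end) and simply converts it into the arguments substr expects.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper (not a standard function): takes a half-open index
// range [start, end) and translates it into substr's (position, length).
std::string slice_by_indices(const std::string& s, std::size_t start, std::size_t end)
{
    end = std::min(end, s.size());        // never go past the last character
    if (start >= end) {
        return std::string();             // empty or inverted range
    }
    return s.substr(start, end - start);  // length = end - start
}
```

Under these assumptions, slice_by_indices("donate", 2, 6) and std::string("donate").substr(2, 4) both give "nate": the end index 6 becomes a length of 6 - 2 = 4.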

Library

In C++, substr is a member function of the string class, which is part of the std namespace. To use string::substr, include the <string> header.

// include 'string' header
#include <string>

int main()
{
    // using 'std' namespace
    std::string str;

    str.substr(0,2);

    return 0;
}

Example 1: Get Substring of a given string

Suppose we have string word = "Donate";

We want the substring “Do”. What would our arguments be to the function? Answer: word.substr(0,2). Starting at index 0, we want 2 characters (can think of it as “two character count” or “two character length”).

What if we want the substring “nate”, what would our arguments be? Answer: word.substr(2,4).

// positions:  012345
string word = "Donate";

// "Do"
string ans1 = word.substr(0,2);

// "nate"
string ans2 = word.substr(2,4);

Example 2: Get Substring from position to end of string

To get the rest of the string starting from a given position, pass std::string::npos as the length, or omit the length entirely:

string sentence = "The quick brown fox jumped over the lazy dog";
//                                              pos 36 ^

// to return "lazy dog"
sentence.substr(36, std::string::npos);

// or simply:
sentence.substr(36);

What is std::string::npos?

std::string::npos is a special constant whose exact meaning depends on the string function that uses it. It is defined as -1 converted to the unsigned type std::string::size_type, which wraps around to the largest value that type can hold.

In the context of substr, passing npos as the length means "all characters until the end of the string". In search functions such as find, it is the return value that signals "not found".
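As an illustration of npos playing two roles, here is a small sketch. The helper text_after is a hypothetical name introduced only for this example; it relies on std::string::find returning npos when nothing is found, and on substr treating npos as "up to the end of the string".

```cpp
#include <string>

// Hypothetical helper: returns everything after the first occurrence of
// `marker`, or an empty string when `marker` does not occur in `s`.
std::string text_after(const std::string& s, const std::string& marker)
{
    const std::string::size_type pos = s.find(marker);
    if (pos == std::string::npos) {   // here npos means "not found"
        return std::string();
    }
    // Here npos means "until the end of the string". It is also the default
    // length, so s.substr(pos + marker.size()) would behave the same.
    return s.substr(pos + marker.size(), std::string::npos);
}
```

For instance, text_after("key=value", "=") yields "value", while text_after("novalue", "=") yields an empty string.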

Time Complexity

Time complexity is O(n), where n is the number of characters copied. In the worst case the whole string is copied, so n = string.size() and every character must be traversed.

Space complexity is also O(n). The returned string is a new copy, so in the worst case, when the whole string is copied, the extra memory used is the size of the original string.

Edge cases

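By default, if no parameters are passed, substr returns a copy of the entire string.

If length is greater than the number of characters remaining after position, substr returns only the characters up to the end of the string. If position equals string::size(), the result is an empty string, and if position is greater than string::size(), substr throws std::out_of_range.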
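The edge cases can be sketched as follows. One more case worth knowing is that std::string::substr throws std::out_of_range when position is greater than size(). The wrapper substr_or_empty below is a hypothetical helper, not a standard function, that turns that exception into an empty result; whether swallowing the error is appropriate depends on the caller.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: behaves like substr, but returns an empty string
// instead of throwing when `pos` is past the end of `s`.
std::string substr_or_empty(const std::string& s, std::size_t pos,
                            std::size_t len = std::string::npos)
{
    try {
        // A length that is too large is clamped by substr itself.
        return s.substr(pos, len);
    } catch (const std::out_of_range&) {
        // Only reached when pos > s.size().
        return std::string();
    }
}
```

With word = "Donate", word.substr() is "Donate", word.substr(2, 100) is "nate", word.substr(6) is an empty string, and substr_or_empty(word, 7) returns an empty string where word.substr(7) would throw.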

Other things to keep in mind

size_t is an unsigned type, so be careful when subtracting from position or from length. If the result would be negative as a signed integer, the unsigned value wraps around to a very large number instead.
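A minimal sketch of how the wraparound can bite, and one way to avoid it. The helper last_n is a hypothetical name used only for illustration; the idea is to compare before subtracting so the unsigned arithmetic can never go below zero.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the last `n` characters of `s`,
// or all of `s` when it has fewer than `n` characters.
std::string last_n(const std::string& s, std::size_t n)
{
    // Risky version: s.substr(s.size() - n). When n > s.size(), the unsigned
    // subtraction wraps around to a huge position and substr throws
    // std::out_of_range.
    if (n >= s.size()) {
        return s;                    // compare first, so nothing can wrap
    }
    return s.substr(s.size() - n);   // safe: n < s.size() here
}
```

Here last_n("Donate", 4) gives "nate", and last_n("Do", 5) gives "Do" rather than throwing.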