C++’s substring function substr returns a newly constructed string: a copy of the portion of the original string selected by its two parameters. It is also a frequent source of confusion. This post discusses how substr works and what the common pitfalls are.
What does substr do?
substr copies a substring of the original string. A substring is a contiguous sequence of characters within a string. For example, “nate” is a substring of “donate”. A substring can also contain the entire original string.
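Because the result is a copy, editing it leaves the original untouched. A minimal sketch of that behaviour follows; the function name substr_is_a_copy is made up for illustration and is not part of any library.

```cpp
#include <string>
#include <utility>

// Hypothetical demo function (not a standard API): takes a substring,
// edits it, and returns both strings so they can be compared.
std::pair<std::string, std::string> substr_is_a_copy()
{
    std::string original = "donate";
    std::string piece = original.substr(2, 4); // copies "nate" into a new string
    piece[0] = 'g';                            // changes only the copy
    return {original, piece};                  // original is still "donate"
}
```

The first element of the returned pair is still the full original string, while the second holds the edited copy.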
Substring Parameters
string::substr is part of the C++ standard library and becomes available with #include <string>.
The substring function is declared as string substr(size_t position = 0, size_t length = std::string::npos) const.
Let’s take a closer look at the two parameters, position and length, which are a source of confusion because they are different kinds of numbers. Position is a 0-based index into the string, while length (i.e., count) is a number of characters, not an index. Many developers assume the two are a start index and an end index, but this is not the case. Position may range from 0 to string::size(); a position equal to string::size() yields an empty string, and anything larger throws std::out_of_range. Length is a count starting from 0, and at most string::size() - position characters are copied no matter how large it is.
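If you think in terms of a start and end index, the conversion to substr arguments is a single subtraction. The sketch below assumes a half-open range [start, end); the helper name slice is hypothetical and does not exist in the standard library.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a standard API): returns the characters in the
// half-open index range [start, end) by converting it to (position, length).
std::string slice(const std::string& s, std::size_t start, std::size_t end)
{
    // Guard against ranges that substr would reject or that would make
    // the unsigned subtraction below wrap around.
    if (start > s.size() || end < start) {
        return std::string();
    }
    return s.substr(start, end - start); // length = end - start
}
```

With word = "Donate", slice(word, 2, 4) gives "na", whereas word.substr(2, 4) gives "nate", because the second argument of substr is a count rather than an end index.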
Library
In C++, substr is a member function of the string class, which is part of the std namespace. To use string::substr, include the <string> header.
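// include the 'string' header
#include <string>
int main()
{
    // 'string' lives in the 'std' namespace
    std::string str = "Donate";
    // copy 2 characters starting at position 0: "Do"
    std::string sub = str.substr(0, 2);
    return 0;
}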
Example 1: Get Substring of a given string
Suppose we have string word = "Donate";.
We want the substring “Do”. What would our arguments to the function be? Answer: word.substr(0,2). Starting at index 0, we want 2 characters (think of it as a “two-character count” or “two-character length”).
What if we want the substring “nate”? Answer: word.substr(2,4). Starting at index 2, we want 4 characters.
// positions:  012345
string word = "Donate";
// "Do"
string ans1 = word.substr(0,2);
// "nate"
string ans2 = word.substr(2,4);
Example 2: Get Substring from position to end of string
To get the rest of the string starting from a given position, either pass std::string::npos as the length or omit the length entirely:
string sentence = "The quick brown fox jumped over the lazy dog";
// the 'l' in "lazy" is at position 36
// both calls below return "lazy dog"
sentence.substr(36, std::string::npos);
// or simply:
sentence.substr(36);
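Counting to position 36 by hand is error-prone. One possible alternative, sketched below, is to let std::string::find locate the starting position first. The helper name from_marker is hypothetical, and returning an empty string when the marker is missing is an assumption of this sketch.

```cpp
#include <string>

// Hypothetical helper (not a standard API): returns everything from the
// first occurrence of `marker` to the end of `text`, or an empty string
// when `marker` does not occur.
std::string from_marker(const std::string& text, const std::string& marker)
{
    const std::string::size_type pos = text.find(marker);
    if (pos == std::string::npos) { // find reports "not found" with npos
        return std::string();
    }
    return text.substr(pos);        // length omitted: copy to the end
}
```

Called with the sentence above and the marker "lazy", this returns "lazy dog" without hard-coding the index.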
What is std::string::npos?
std::string::npos is a special value whose exact meaning depends on the string function that uses it. It is defined as -1 converted to the unsigned type std::string::size_type, which makes it the largest value that type can hold. For search functions such as find, it signals that nothing was found. In the context of substr, it represents the largest possible length (or count) of characters, which in practice means “until the end of the string”.
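Both claims can be checked in a few lines. The sketch below verifies at compile time that npos equals the maximum of std::string::size_type, and shows npos in its "not found" role. The function name contains is a hypothetical helper written for this example.

```cpp
#include <limits>
#include <string>

// npos is -1 converted to the unsigned size_type, i.e. its maximum value.
static_assert(std::string::npos ==
                  std::numeric_limits<std::string::size_type>::max(),
              "npos should be the largest value of size_type");

// Hypothetical helper (not a standard API): true when `needle` occurs
// somewhere in `text`. find returns npos when there is no match.
bool contains(const std::string& text, const std::string& needle)
{
    return text.find(needle) != std::string::npos;
}
```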
Time Complexity
Time complexity is O(n), where n = string.size(). The cost is proportional to the number of characters copied, and in the worst case that is the whole string.
Space complexity is O(n), where n = string.size(). Similarly, in the worst case the whole string is copied, so the extra memory used equals the size of the string.
Edge cases
By default, if no parameters are passed, substr returns the entire string.
If length is greater than the number of characters remaining after position, substr returns only the characters that are available, up to the end of the string.
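One more edge case concerns position: a position equal to the string's size returns an empty string, while a larger one throws std::out_of_range. The sketch below wraps that check in a hypothetical helper named substr_throws, written only for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not a standard API): reports whether calling
// s.substr(pos) throws std::out_of_range.
bool substr_throws(const std::string& s, std::size_t pos)
{
    try {
        (void)s.substr(pos); // result is discarded; only the throw matters
        return false;
    } catch (const std::out_of_range&) {
        return true;         // pos was greater than s.size()
    }
}
```

For word = "Donate" (size 6), word.substr(2, 100) is clamped to "nate", word.substr(6) is empty, and word.substr(7) throws.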
Other things to keep in mind
size_t is an unsigned type, so be careful when subtracting from position or length. If the result would be negative as a signed integer, the unsigned value wraps around to a very large number instead.
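A typical place where this bites is taking the last n characters of a string. The sketch below compares before subtracting so the unsigned arithmetic cannot wrap. The helper name last_n is hypothetical, and returning the whole string when n is too large is a design choice of this example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a standard API): returns the last `n`
// characters of `s`, or all of `s` when it has `n` characters or fewer.
std::string last_n(const std::string& s, std::size_t n)
{
    // Without this check, s.size() - n would wrap around to a huge value
    // when n > s.size(), and substr would throw std::out_of_range.
    if (n >= s.size()) {
        return s;
    }
    return s.substr(s.size() - n);
}
```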