c++ - Fast 'group by/count' std::vector<std::u16string> into a std::map<u16string, int>


I have a function that reads ~10000 words into a vector. I want to group all the words into a map to 'count' how many times a certain word appears.

While the code 'works', it can sometimes take 2 seconds to re-build the map.

NB: Unfortunately, I cannot change the 'read' function, so I have to work with the vector of std::u16string.

    std::vector<std::u16string> vvalues;
    vvalues.push_back( ... )
    ...

    std::map<std::u16string, int> mvalues;
    for( auto it = vvalues.begin(); it != vvalues.end(); ++it )
    {
      if( mvalues.find( *it ) == mvalues.end() )
      {
        mvalues[*it] = 1;
      }
      else
      {
        ++mvalues[*it];
      }
    }

How could I speed up the 'group by' while keeping track of the number of times each word appears in the vector?

If you call std::map::operator[] on a new key, the value for that key is value-initialized (to 0 for PODs such as int). So your loop can be simplified to:

    for (auto it = vvalues.begin(); it != vvalues.end(); ++it)
        ++mvalues[*it];

If there is no key *it yet, the default value 0 is inserted and incremented immediately, so it becomes 1.

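If the key already exists, its value is simply incremented. This also avoids the separate find() call, so each word costs one map lookup instead of two.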
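To make the behavior above concrete, here is a minimal, self-contained sketch of the simplified loop. The function name count_words and the range-based for are illustrative choices, not part of the original code:

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper (name is an assumption, not from the original post):
// groups the words of a vector into a std::map of word -> occurrence count.
std::map<std::u16string, int> count_words(const std::vector<std::u16string>& words)
{
    std::map<std::u16string, int> counts;
    for (const auto& word : words)
    {
        // operator[] inserts a value-initialized int (0) when the key is
        // missing; the increment then brings it to 1. For an existing key
        // the stored count is simply incremented.
        ++counts[word];
    }
    return counts;
}
```

Calling count_words(vvalues) would then produce the same contents as the mvalues map built in the question.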

Furthermore, it doesn't look like you need the map to be ordered, so you can use std::unordered_map instead. Its insertion is average constant time, instead of logarithmic, which should speed things up further.
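A rough sketch of the std::unordered_map variant follows. The function name count_words_unordered is hypothetical, and the reserve() call is an extra assumption of mine rather than something the answer requires: it pre-sizes the table for the worst case where every word is distinct, which may avoid rehashing during the loop.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper (name is an assumption): same counting idea, but with a
// hash map, so the result is not sorted by key.
std::unordered_map<std::u16string, int>
count_words_unordered(const std::vector<std::u16string>& words)
{
    std::unordered_map<std::u16string, int> counts;

    // Assumption: reserving for words.size() elements is an upper bound on
    // the number of distinct keys and is optional.
    counts.reserve(words.size());

    for (const auto& word : words)
    {
        // Same operator[] behavior as std::map: a missing key starts at 0.
        ++counts[word];
    }
    return counts;
}
```

The standard library provides std::hash for std::u16string, so no custom hash function is needed here. If sorted output is required later, the counts can be copied into a std::map once at the end.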

