Some useful C++ snippets to manipulate strings

Maz
Admin emeritus
Posts: 1938
Joined: Thu Mar 16, 2006 21:11
Location: In the deepest ShadowS
Contact:

Post by Maz »

DISCLAIMER!

Snippets I provide may not have been fully tested. If you find any errors, please report them to me at
Mazziesaccount (_a_t_) gmail . com

1st:

Cutting a string into pieces based on some delimiter string.
(like the PHP explode() function, but written in C++)

Code: Select all

/* This one 'explodes' a string (cuts it into pieces based on some matchstring) and puts the pieces into a vector. Arguments are the string to be cut, the matchstring and the vector where to store the pieces. It returns the number of cuts made (one less than the number of pieces), and if the matchstring is empty or not found it returns -1. */


#include <iostream>
#include <vector>
#include <string>

using std::string;
using std::vector;


int explode(string tear_me,string cut,vector<string> &store)
{
        // size_type, not unsigned int: string::npos does not fit into an
        // unsigned int on 64-bit systems, so the loop test would never fail
        string::size_type start=0;
        string::size_type len=0;
        int i=0;
        vector<string> temp;

        len=cut.length();
        // an empty matchstring would be "found" at position 0 forever
        if(len==0 || tear_me.find(cut)==string::npos)
        {
                return -1;
        }
        while( (start=tear_me.find(cut))!=string::npos)
        {
                temp.push_back(tear_me.substr(0,start));
                tear_me.erase(0,start+len);
                i++;
        }
        temp.push_back(tear_me);
        store=temp;
        return i;
}
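
The explode() above erases the front of the string after every hit. As a rough sketch of an alternative, the variant below walks through the string with the start-position argument of find() instead. The name split_on and the choice to return the vector (and to return the whole input as one piece when nothing is found) are assumptions made for illustration, not part of the original snippet.

```cpp
#include <string>
#include <vector>

// Hypothetical variant of explode(): returns the pieces instead of filling
// an out-parameter. An empty or unmatched delimiter yields one piece that
// holds the whole input.
std::vector<std::string> split_on(const std::string& text,
                                  const std::string& delimiter)
{
    std::vector<std::string> pieces;
    if (delimiter.empty()) {
        pieces.push_back(text);
        return pieces;
    }

    std::string::size_type begin = 0;   // start of the current piece
    std::string::size_type hit;         // position of the next delimiter
    while ((hit = text.find(delimiter, begin)) != std::string::npos) {
        pieces.push_back(text.substr(begin, hit - begin));
        begin = hit + delimiter.length();
    }
    pieces.push_back(text.substr(begin));   // whatever follows the last hit
    return pieces;
}
```

Calling split_on("a, b, c", ", ") gives the three pieces a, b and c; two delimiters in a row produce an empty piece between them.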
Attachments
explode.zip
(544 Bytes) Downloaded 585 times

Post by Maz »

Integer to array conversion with C++ (well, integer to C++ style string in this case).
itoa() is not part of the C or C++ standard, so not every compiler's library includes it. So here's one possible itoa for you (integer to array conversion, in this case integer to string though).

Code: Select all

/* Transforms an integer to a string. Arguments are the integer to be transformed, and the base of the system (binary, 10, hexadecimal...). Works only for bases from 2 to 16. */

#include <iostream>
#include <string>

using std::string;



string itoa(int value, unsigned int base)
{

        const char digits[] = "0123456789abcdef";
        string result,minus;


        // base 1 can never reduce the value, so 2 is the smallest usable base
        if (base<2||base>16) {
                result="nogoodbase";
                return result;
        }

        // Check for case when input is zero:
        if (value == 0) return "0";

        // negative int?: take the magnitude as unsigned, so that the most
        // negative int does not overflow when its sign is flipped
        unsigned int magnitude = static_cast<unsigned int>(value);
        if (value < 0) {
                magnitude = 0u - magnitude;
                minus = "-";
        }

        // Translating number to string with base (until no digits are left):
        while (magnitude) {
                result=digits[ magnitude % base ]+result;
                magnitude /= base;
        }
        return minus.append(result);
}
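
For comparison, newer compilers ship a standard facility that does the same job. The sketch below assumes a C++17 compiler and uses std::to_chars from the charconv header, which accepts bases from 2 to 36. The wrapper name int_to_string and the choice to return an empty string on a bad base are made up for this example.

```cpp
#include <charconv>
#include <string>
#include <system_error>

// Hypothetical wrapper around std::to_chars (C++17).
// Returns an empty string if the base is outside 2..36.
std::string int_to_string(int value, int base)
{
    if (base < 2 || base > 36) return std::string();

    char buffer[40];   // room for a sign plus 32 binary digits
    std::to_chars_result r =
        std::to_chars(buffer, buffer + sizeof buffer, value, base);
    if (r.ec != std::errc()) return std::string();

    return std::string(buffer, r.ptr);   // r.ptr points one past the last digit
}
```

For plain base 10, std::to_string(value) from the string header (C++11) is enough.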
Attachments
itoa.zip
(598 Bytes) Downloaded 587 times
mistergreen77
Tycoon
Posts: 269
Joined: Fri Mar 31, 2006 2:09
Location: Brisbane

Post by mistergreen77 »

DISCLAIMER!

I have compiled and used this program but make no claims to quality or reliability. I am self-taught when it comes to programming. Use at your own risk.

Small C program to replace strings in a file with another string of arbitrary length. I wrote it to use in scripts for updating references in html files.

Code: Select all


#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* A C program by David L */

int main(int argc, char *argv[])
{
  if (argc != 5) {
           printf("%s%s%s",
                  "usage: replace filename1 string1 string2 filename2\n",
                  "Replace instances of string1 in filename1 with string2\n",
                  "and write the results to filename2\n");
                  exit(1); }

  FILE *ifp, *ofp;
  char *searchString = argv[2], *replaceString = argv[3];
  int searchSize = strlen(searchString);
  int replaceSize = strlen(replaceString);
  /* one extra byte: the code below stores '\0' at searchResult[searchSize] */
  char searchResult[searchSize + 1];

  ifp = fopen(argv[1],"r");
  ofp = fopen(argv[4],"w");
  if (ifp == NULL || ofp == NULL) { perror("fopen"); exit(1); }

  // base case - beginning of the search
  int i, j, x, count = 0;
    for (i = 0; i < searchSize; ++i) searchResult[i] = fgetc(ifp);
  searchResult[searchSize] = '\0'; /* strcmp needs a terminated string */

  if (!strcmp(searchString, searchResult)) { count++;
     for (j=0; j<searchSize; ++j) {
             x = fgetc(ifp);
        strncpy(searchResult, &searchResult[1], searchSize-1);
        searchResult[searchSize-1] = x;
        searchResult[searchSize] = '\0'; }
     fputs(replaceString, ofp); fputc(searchResult[0],ofp);
     } else
     fputc(searchResult[0],ofp);

  // the search loop
  while (!feof(ifp)) {
        x = fgetc(ifp);
        strncpy(searchResult, &searchResult[1], searchSize-1);
        searchResult[searchSize-1] = x;
        searchResult[searchSize] = '\0';
//        printf ("%s %s\n",searchResult, searchString);
  if (!strcmp(searchString, searchResult)) { count++;
     for (j=0; j<searchSize; ++j) {
             x = fgetc(ifp);
        strncpy(searchResult, &searchResult[1], searchSize-1);
        searchResult[searchSize-1] = x;
        searchResult[searchSize] = '\0'; }
     fputs(replaceString, ofp); fputc(searchResult[0],ofp);
     } else
     fputc(searchResult[0],ofp); } // end loop

  for (i=1; i<searchSize; ++i) fputc(searchResult[i],ofp);
  printf("Replaced %d instances of %s with %s.\n",count,searchString,replaceString);

  fclose(ifp);
  fclose(ofp);
  return 0;
}
 
[size=84][color=green]“Everything should be made as simple as possible, but not one bit simpler.”[/color] - Einstein

[color=green]“There is always some madness in love. But there is also always some reason in madness.”[/color] - Nietzsche[/size]

:twisted: [url=http://forum.connect-webdesign.dk/viewtopic.php?p=5411#5411]Society of Sinister Minds.[/url]

Post by Maz »

I admit, MrG, that I did not test your program here, but... have you tested it in cases where:

1. your search string is, let's say, foo, and the file you're editing ends with fo?
2. your file does not have a newline (\n) as its last character?

I think there might be a problem with your while loop. (As I said, I did not test your program, but loops like yours often hide a little bug.)

You do have the condition !feof(ifp) in the while loop. Still, you read the file inside the loop and perform operations BEFORE checking the condition again. If you did this by reading a whole line from the file instead of a single character, you would end up in trouble. (Mm.. now that I think of it, you've probably avoided that problem by reading only one char.)

Bah. Finally I'll get to the point. The following is one of the most common mistakes I have seen, so even if MrG's program is OK, it's good to write this up..

EOF problem:

Example:
(I'll write this with C++ strings, but you can probably think of the analogous C function)

Code: Select all

.
.
.
while(!readfile.eof())
{
    getline(readfile,stringvariable);
    writefile << stringvariable+"\n";
}
readfile.close();
writefile.close();
return EXIT_SUCCESS;
Can you spot the flaw in reasoning here?

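I'll tell..

1. you check if eof was found
2. you read a line
3. you write the line
4. you check for eof...

The eof flag is only set once a read has actually run into the end of the file. So when the read in step 2 hits eof, you still perform step 3 before checking for eof. This way you end up writing one line too many (typically an empty one with C++ getline; with C's fgets() the buffer still holds the last line, so that one gets written twice). You can avoid this by checking for eof AFTER you read the line, but BEFORE writing it into a file. For example: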

Code: Select all

bool ok=true;
.
.
.
while(ok)
{
     getline(readfile,strvar);
     // fail() rather than !good(): a last line without "\n" sets eof, but is still a valid read
     if(readfile.fail()) ok=false;
     else
     {
          writefile << strvar+"\n";
     }
}
readfile.close();
writefile.close();
return EXIT_SUCCESS;

Post by mistergreen77 »

Thanks for pointing this out. I taught myself C from a text book so it is good to get feedback.
1. your search string is lets say foo, and the file you're editing ends with fo
2. Your file does not have newline (\n) as last character?
Good questions, I will try re-writing it with your improved loop construct and also test it for these cases.

Post by mistergreen77 »

Hmmm...well, I tested both cases and it worked okay except for a mystery byte tacked on the end of the output file. I don't know why I didn't notice this before. When I reach the eof marker I still have searchSize characters left to write to the output, so I wrote

for (i=1; i<searchSize; ++i) fputc(searchResult[i],ofp);

to flush the contents of searchResult to ofp
when I should have written

for (i=1; i<searchSize-1; ++i) fputc(searchResult[i],ofp);


I understand now exactly what you mean about the potential for bugs in a loop constructed this way, but you are also right that because I am reading one char at a time there is no bug. I actually felt that I had to read in with fgetc because,
let's say I am looking for foo_bar and I get seven characters at a time:

xxxfoo_

and next

barxxx

there I would have missed a match!


I would say this is a simplistic approach to the problem - I am sure there is a more elegant way to do it. Especially when I used
strncpy(searchResult, &searchResult[1], searchSize-1); to drop the first character from the searchResult. That is what I came up with but I feel like there must be a better way. Maybe if I had grabbed a chunk of memory off the heap I could have just used some fancy use of pointers to achieve the same result.

Post by Maz »

I did not mean to criticize your program, mate :)
I think fgetc() is a perfectly fine way to do that when using plain vanilla C :) If one wishes to replace strings only when they're on one line, one can of course read the whole line into memory, for example with fgets(), and then examine it in a loop (like I did with my first desperate attempt, which may still be visible somewhere at OG... And that's terrible coding :D ) but I am not so sure it speeds things up much :p

Nowadays I doubt I could write code like yours too quickly.. I would fall into every possible trap, since I'm used to C++ and the STL library it provides. I do not know about elegance, nor performance, but with C++ that would have been a hel* of a lot easier to do :D :D

Errmm... I think I should not spam this thread... But I still want to tell why C++ eases this task so much..

1. strings. The C++ alternative to char arrays (you can still use char arrays, but actually there's no need.)
- inbuilt memory allocation -> no overflows when appending or assigning! : security + MUCH easier to use & avoid segfaults (You can ALWAYS do foo+="barbarbarbarbarbar..." or foo="barbarbarbarbar..." if your computer has enough free memory :D Of course if you store foo="bar", you still cannot assign a value to foo[3]; operator[] is not range-checked, but foo.at(3) throws std::out_of_range.)
- overloaded operators like +, +=, and ==. (+ to glue strings, or a string and a char array!, += to glue and store in the original string, == to compare 2 strings)
- LOTS of functions to ease string handling. For example find(string,[position where to start the search]), which returns the position of the match or string::npos : see the previous code ;) erase(startingpoint,number of chars), substr(startingpoint,number of chars), length() etc...

2. vectors
A vector<string> is a bit like a 2 dimensional char array. You can do vector<string> foo, or vector<int> foo, or vector<whatever_even_your_own_object_type> foo. Works like an array, but the memory allocation is once again inbuilt! You can always add a new 'row' to a vector by simply using foo.push_back(what_to_push_back)
- similarly to string, vectors have a bunch of functions to play with...
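
A few of the string and vector features listed above, put together in one small sketch. The function name string_demo and the sample texts are made up for illustration; it returns the strings it builds so they can be inspected.

```cpp
#include <string>
#include <vector>

// Hypothetical demo: collects a few results in a vector that grows by itself.
std::vector<std::string> string_demo()
{
    std::vector<std::string> results;

    std::string foo = "bar";
    foo += "barbar";                     // += glues and stores: "barbarbar"
    results.push_back(foo);

    // + glues a string and a char array
    std::string text = std::string("hello ") + "world";
    std::string::size_type pos = text.find("world");   // position 6
    results.push_back(text.substr(pos, 3));            // "wor"

    text.erase(0, pos);                  // drops "hello ", leaves "world"
    results.push_back(text);

    if (text == "world")                 // == compares the contents
        results.push_back("equal");

    return results;
}
```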

Now to demonstrate the 'power' of strings, I'll show one way with pseudocode how to handle this with C++. You'll see, MrG, that you do want to start using C++ too, especially when plain C is almost entirely included in C++ ;)

Code: Select all

1. read from file until the end of file (for example line by line; if you use getline(filestream,foo) remember to add "\n" at the end of every line read, since getline does not store the "\n"), and store the results either in one large string, or in a vector. Next do a loop to search the matchString

// position is a string::size_type, so it can be compared with string::npos directly
while( (position=filecontents.find(matchString) )!=string::npos)
{
//let's store the beginning of filecontents that stays unchanged in string temp
    temp=filecontents.substr(0,position);
//then we'll erase the whole beginning of filecontents, till the end of the string we wished to replace.
    filecontents.erase(0,position+matchstrlen);
//now let's append to temp the replacement and the rest of the file.
    temp+=replacementString+filecontents;
//finally store temp back in filecontents
    filecontents=temp;
}
Of course the above example does not work if we wish to replace "foo" with "hfoohjdl" for example. Why? ;)

That could be avoided for example by not storing the whole file back into the filecontents string: leave the beginning of the filecontents string erased, and grow temp round by round until it contains the whole file.
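
The approach just described can be sketched roughly like this. The function name replace_all and the decision to return the input untouched for an empty matchString are assumptions for this example; the variable names follow the pseudocode above.

```cpp
#include <string>

// Hypothetical helper: temp grows round by round, and the part of
// filecontents that has already been handled is erased and never searched
// again, so a replacement that contains matchString cannot be matched anew.
std::string replace_all(std::string filecontents,
                        const std::string& matchString,
                        const std::string& replacementString)
{
    if (matchString.empty()) return filecontents;   // nothing to search for

    std::string temp;
    std::string::size_type position;
    while ((position = filecontents.find(matchString)) != std::string::npos) {
        temp += filecontents.substr(0, position);   // the unchanged beginning
        temp += replacementString;
        filecontents.erase(0, position + matchString.length());
    }
    temp += filecontents;   // the rest of the file after the last match
    return temp;
}
```

With this version, replacing "foo" with "hfoohjdl" terminates, because the "foo" inside the replacement only ever lands in temp.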

With vectors it would be almost the same; now we could just store each row in an individual element of a vector, and examine the vector element by element. In this case it does not benefit us, but sometimes.. no, often vectors are more than handy!

Post by mistergreen77 »

I see this was a C++ thread, but since C++ is (nearly) a superset of C I figured this still qualified. :D
C++ does make a lot of things easier, though I think it may be best suited to larger applications of the sort I would never undertake by myself. Object oriented programming isn't easy for me and it is hard for me to understand the syntax for some of the higher levels of abstraction like virtual functions and inheritance. Perhaps one of these days I will find time to give it some more study.

Post by Maz »

Actually, inheritance, virtual functions etc. are a simple and effective way to do things. It just requires one to do good planning at the start. And I bet if you wanted, you would learn them both after trying them out once or twice :) Well, now I'm getting off topic, so I'd better search for some old scripts so I can edit these posts to contain some real info :D

Post by Maz »

This does not really have anything to do with string manipulation; instead, this is a Finnish version of a socket usage example :)
Attachments
sock_test.zip
(4.17 KiB) Downloaded 597 times