Re: replace() function

From: Byron C. Darrah <bdarr_at_sse.FU.HAC.COM_at_hypermail-project.org>
Date: Fri, 9 Oct 1998 14:07:50 -0700 (PDT)
Message-Id: <199810092107.OAA03350_at_pepperoni.pizza.hac.com>

Good morning,

I did a little more checking on this, so here's the skinny if you're interested.

I noticed that some attention to this bug was indeed paid in an early version of the new hypermail. However, the "fix" was more of a work-around than a real fix...

The problem tends to manifest itself most often for single-character substitutions. (Eg: replacing single-character '%' with "%25"). So the work-around provided was to implement a whole seperate version of
"replace()", called "replacechar()" that is used for certain
single-character substitutions.

However, "replace()" is still used for some other subsitutions.
"replace()", as it is now, is still buggy and will blow up anytime it is
used to replace an old string with a new one that happens to contain the old one. For example, if trying to replace "$SUBJECT" with the string
"Re: Hypermail crashes when $SUBJECT is used in the subject" :-).

--Byron Darrah

PS: On the Y2K subject. If anyone wants a fairly simple date class for C or C++ that can add, subtract, convert to text, and parse dates reliably, I've got one at http://www.cs.ucla.edu/~darrah/date_t.tgz that you can try. Not to be confused with the dates.c file in hypermail, which does other stuff. My date class is "guranteed" to correctly account for century boundaries, leap years, quad leap years, leap centuries, quad leap centuries, and the Papal decree in 1752, or your money back :-).



Date: Fri, 9 Oct 1998 10:57:36 -0700 (PDT) From: "Byron C. Darrah" <bdarr_at_sed.hac.com> Sender: owner-hypermail_at_landfield.com Precedence: bulk
Reply-To: "Byron C. Darrah" <bdarr_at_sed.hac.com>

Hi, I noticed a bug in an old variant version of hypermail so I inspected the sources of the latest beta release and it seems to be there, too. I'm a little surprised that it hasn't been caught by now.

Anyway, there's a function called replace() in strings.c that is used to do things like variable substitution and url encoding. (eg: replace occurrences of "$TO" with an email address.) Well, this function is written such that if the new text *contains* the pattern you are replacing, you get incorrect results. In fact, you get infinite recursion. For example, try to replace "%" with "%25" (for url-encoding), and you end up

with an endless sequence of "%%%%%%%%%%%%%%%%%%%%%%%...".

I noticed the new version is slightly better than my crummy old deviant version because it uses strcpymax() instead of strcpy() at one place, but I think the basic problem is still there.

Here is a possible replacement for replace().

Besides the main problem, replace() is tail-recursive which is unnecessary, and also tends to do some unnecessary recopying of a static buffer onto itself due to the recursion. So I cleaned that up a bit, too.

--Byron Darrah

/* Given a string, replaces all instances of "oldpiece" with "newpiece".
 *
 * Modified this routine to eliminate recursion and to avoid infinite
 * expansion of string when newpiece contains oldpiece.  --Byron
*/

char *replace(char *string, char *oldpiece, char *newpiece) {

   int str_index, newstr_index, oldpiece_index, end,

      new_len, old_len, cpy_len;
   char *c;
   static char newstring[MAXLINE];

   if ((c = (char *) strstr(string, oldpiece)) == NULL)

      return string;

   new_len        = strlen(newpiece);
   old_len        = strlen(oldpiece);
   end            = strlen(string)   - old_len;
   oldpiece_index = c - string;

   newstr_index = 0;
   str_index = 0;
   while(str_index <= end && c != NULL)
   {

      /* Copy characters from the left of matched pattern occurence */
      cpy_len = oldpiece_index-str_index;
      strncpy(newstring+newstr_index, string+str_index, cpy_len);
      newstr_index += cpy_len;
      str_index    += cpy_len;

      /* Copy replacement characters instead of matched pattern */
      strcpy(newstring+newstr_index, newpiece);
      newstr_index += new_len;
      str_index    += old_len;

      /* Check for another pattern match */
      if((c = (char *) strstr(string+str_index, oldpiece)) != NULL)
         oldpiece_index = c - string;

   }
   /* Copy remaining characters from the right of last matched pattern */    strcpy(newstring+newstr_index, string+str_index);

   return newstring;
} Received on Fri 09 Oct 1998 11:11:49 PM GMT

This archive was generated by hypermail 2.3.0 : Sat 13 Mar 2010 03:46:11 AM GMT GMT