Fixing odd characters

One of the biggest complaints that web developers have is their lack of control over end users.

When will we be able to tell them a) not to paste from Microsoft Word, or b) if they must, to paste into something that doesn’t do formatting first?

Word has a nasty habit of converting quotes into “smart” quotes, as well as messing up characters such as the euro symbol €.

This is the fix I came up with that allows end users to continue to paste from such programs and bring their weird charset issues with them:

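function fixChars($text)
{
  // Replacements are HTML entities so that the single-byte patterns
  // further down cannot re-match bytes inside an earlier replacement.
  $chars = array(
    // UTF-8 byte sequences
    "/\xe2\x82\xac/" => "&euro;",
    "/\xe2\x80\x99/" => "'",
    "/\xe2\x80\x9c/" => "&ldquo;", // open quotes
    "/\xe2\x80\x9d/" => "&rdquo;", // close quotes
    "/\xe2\x80\x93/" => "&ndash;", // en dash
    // Windows-1252 single bytes
    "/\x80/" => "&euro;",
    "/\x92/" => "'",
    "/\x93/" => "&ldquo;", // open quotes
    "/\x94/" => "&rdquo;", // close quotes
    "/\x96/" => "&ndash;", // en dash
    "/\xa3/" => "&pound;",
  );
  $find = array_keys($chars);
  $replace = array_values($chars);
  return preg_replace($find, $replace, $text);
}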
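
For anyone who wants to try the idea on a known input, here is a rough single-pass sketch of the same lookup-table approach. It swaps preg_replace() for strtr(), assumes the output is destined for HTML (so it emits entities), and adds a UTF-8 pound sign entry; the name fixCharsOnePass and that extra entry are my own illustration, not part of the function above.

```php
<?php
// Hypothetical single-pass variant; fixCharsOnePass is an illustrative name.
function fixCharsOnePass($text)
{
    $map = array(
        // UTF-8 byte sequences
        "\xe2\x82\xac" => "&euro;",
        "\xe2\x80\x99" => "'",
        "\xe2\x80\x9c" => "&ldquo;", // open quotes
        "\xe2\x80\x9d" => "&rdquo;", // close quotes
        "\xe2\x80\x93" => "&ndash;", // en dash
        "\xc2\xa3"     => "&pound;", // assumption: UTF-8 pound sign added here
        // Windows-1252 single bytes
        "\x80" => "&euro;",
        "\x92" => "'",
        "\x93" => "&ldquo;", // open quotes
        "\x94" => "&rdquo;", // close quotes
        "\x96" => "&ndash;", // en dash
        "\xa3" => "&pound;",
    );
    // strtr() tries the longest matching key at each position and does not
    // rescan text it has already replaced, so a three-byte sequence is never
    // split by one of the single-byte entries.
    return strtr($text, $map);
}

// Input mixing a Windows-1252 quote pair with a UTF-8 euro sign.
echo fixCharsOnePass("\x93Hello\x94 costs \xe2\x82\xac5"), "\n";
// prints: &ldquo;Hello&rdquo; costs &euro;5
```

Like the original, this still treats any stray 0x80 to 0x96 byte as Windows-1252, so it is only suitable for text that is mostly ASCII plus the characters listed.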

2 thoughts on “Fixing odd characters”

    1. That looks a little more comprehensive (though missing the UK pound symbol) 🙂
      I did it myself as there is more than one weird set of chars to cope with.
      Could always be extended.
