Scalabium Software


#162: How can I extract the plain text from an HTML-formatted string?

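Today I want to publish a small function that extracts the plain text from an HTML-formatted string. It skips everything between '<' and '>', drops line breaks, and converts the five predefined character entities back to plain characters: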

function StripHTMLTags(const strHTML: string): string;
var
  P: PChar;
  InTag: Boolean;
begin
  Result := '';
  P := PChar(strHTML);
  InTag := False;

  {while..do instead of repeat..until, so an empty string never enters the loop}
  while P^ <> #0 do
  begin
    case P^ of
      '<': InTag := True;
      '>': InTag := False;
      #13, #10: ; {line breaks are dropped}
    else
      {copy text outside tags, but skip a space or tab that is
       followed by more whitespace or by the start of a tag}
      if (not InTag) and
         not ((P^ in [#9, #32]) and ((P+1)^ in [#10, #13, #32, #9, '<'])) then
        Result := Result + P^;
    end;
    Inc(P);
  end;

  {convert character entities; &amp; goes last so that text such as
   &amp;lt; is not decoded twice}
  Result := StringReplace(Result, '&quot;', '"',  [rfReplaceAll]);
  Result := StringReplace(Result, '&apos;', '''', [rfReplaceAll]);
  Result := StringReplace(Result, '&gt;',   '>',  [rfReplaceAll]);
  Result := StringReplace(Result, '&lt;',   '<',  [rfReplaceAll]);
  Result := StringReplace(Result, '&amp;',  '&',  [rfReplaceAll]);
  {add other character entities here if you need them}
end;
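
The closing comment in StripHTMLTags invites you to handle more entities. One kind that a fixed list of StringReplace calls cannot cover is the decimal character reference, such as &#65; for 'A'. Below is a minimal sketch of a decoder for these. The name DecodeDecimalEntities is hypothetical and not part of the original tip, and the code assumes a Unicode Delphi (2009 or later), where Char is 16 bits wide.

```pascal
{ Hypothetical helper, not part of the original tip.
  Replaces decimal character references (&#65; and the like) with the
  character they encode. Assumes Char is 16 bits wide, so only codes
  1..$FFFF are decoded; the surrogate range is not treated specially.
  Malformed or out-of-range references are copied unchanged. }
function DecodeDecimalEntities(const S: string): string;
var
  i, j, Code: Integer;
begin
  Result := '';
  i := 1;
  while i <= Length(S) do
  begin
    if (S[i] = '&') and (i < Length(S)) and (S[i + 1] = '#') then
    begin
      { collect the digits after '&#'; stop early once the code is too large }
      j := i + 2;
      Code := 0;
      while (j <= Length(S)) and (S[j] >= '0') and (S[j] <= '9') and
            (Code <= $FFFF) do
      begin
        Code := Code * 10 + (Ord(S[j]) - Ord('0'));
        Inc(j);
      end;
      { accept only: at least one digit, a closing ';', a code in range }
      if (j > i + 2) and (j <= Length(S)) and (S[j] = ';') and
         (Code >= 1) and (Code <= $FFFF) then
      begin
        Result := Result + Char(Code);
        i := j + 1;
        Continue;
      end;
    end;
    { ordinary character, or a reference that was not well formed }
    Result := Result + S[i];
    Inc(i);
  end;
end;
```

If you use something like this, it probably belongs before the line that replaces &amp;. That way the input &amp;#65; still ends up as the literal text &#65; rather than being decoded a second time into 'A'.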


Published: April 6, 2004


Copyright© 1998-2015, Scalabium Software. All rights reserved.