PicoToolkit

HTML stripper

The fastest way to strip HTML tags

PicoToolkit's HTML Stripper cleans HTML from pasted text quickly and predictably. Choose how aggressive the tool is: remove all tags and attributes, strip attributes only (keep tags), or use tag and attribute whitelists to preserve only the markup you need. Note: this tool uses whitelists only (no blacklist mode).

How to use

  • Paste your HTML into the input area.
  • Choose a mode:
    • Remove — HTML tags and attributes: delete every tag and its attributes. Output becomes plain text (all tags removed, including <script> and <style>).
    • Attributes only: keep all tags but remove attributes that are not on the attributes whitelist. Tag contents remain (so <script> and <style> blocks are preserved in this mode).
    • Tags whitelist: supply a list of tags to keep (for example: p, a, strong). Any tag not on the whitelist is removed, while its text content stays in place. Because only whitelists exist, you must explicitly list every tag you want to preserve.
    • Attrs whitelist: define which attributes are allowed per tag (for example: a[href], img[src|alt]). Any attribute not on the whitelist will be stripped.
  • Preview the result, adjust whitelists or mode, then copy or download cleaned HTML.
  • Processing is limited by browser memory (no hard server-side limit). For very large inputs, process in chunks.
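
The whitelist notation shown in the modes above (p, a, strong for tags; a[href], img[src|alt] for attributes) maps naturally onto simple lookup structures. The following is a minimal Python sketch, assuming the notation is exactly as written in this page's examples. The function names parse_tag_whitelist and parse_attr_whitelist are hypothetical and are not part of PicoToolkit.

```python
import re


def parse_tag_whitelist(spec: str) -> set[str]:
    """Turn 'p, a, strong' into {'p', 'a', 'strong'} (hypothetical helper)."""
    return {tag.strip().lower() for tag in spec.split(",") if tag.strip()}


def parse_attr_whitelist(spec: str) -> dict[str, set[str]]:
    """Turn 'a[href], img[src|alt]' into {'a': {'href'}, 'img': {'src', 'alt'}}.

    Assumes the tag[attr|attr] notation used in this page's examples.
    """
    allowed: dict[str, set[str]] = {}
    # Each match is (tag name, text between the square brackets).
    for tag, attrs in re.findall(r"([A-Za-z][\w-]*)\s*\[([^\]]*)\]", spec):
        names = {a.strip().lower() for a in attrs.split("|") if a.strip()}
        allowed.setdefault(tag.lower(), set()).update(names)
    return allowed
```

Keeping attributes in a per-tag dictionary mirrors the per-tag form of the Attrs whitelist: an attribute allowed on one tag is not automatically allowed on another.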

Defaults & security behavior

  • Whitelist-only model: safe-by-default — attributes and tags are removed unless explicitly allowed by your whitelists or by choosing a mode that keeps tags (Attributes only).
  • Script and style handling:
    • In Remove mode: <script> and <style> tags and their contents are removed.
    • In Attributes only mode: tags (including <script> and <style>) are preserved; only attributes are removed unless whitelisted.
    • With Tags whitelist: only tags you list are kept; script/style remain only if you include them.
  • Event-handler attributes (on*) and ARIA attributes are removed unless explicitly allowed in the Attrs whitelist.
  • When allowing href/src values, avoid javascript: and unsafe data: URIs unless you intentionally permit them. Always validate URLs after cleaning.
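
The advice to validate URLs after cleaning can be sketched as a scheme check on each kept href or src value. This is a minimal Python example under stated assumptions: the SAFE_SCHEMES set and the is_safe_url name are illustrative choices, not PicoToolkit behavior, and relative URLs (no scheme) are treated as acceptable.

```python
from urllib.parse import urlsplit

# Assumption: only these schemes are acceptable; adjust to your own policy.
SAFE_SCHEMES = {"http", "https", "mailto"}


def is_safe_url(value: str) -> bool:
    """Return True for relative URLs and for URLs with an allowed scheme."""
    # Browsers ignore surrounding whitespace and embedded tabs/newlines,
    # so remove them before inspecting the scheme.
    cleaned = "".join(ch for ch in value.strip() if ch not in "\t\r\n")
    scheme = urlsplit(cleaned).scheme.lower()
    return scheme == "" or scheme in SAFE_SCHEMES
```

An allow-list of schemes follows the same whitelist-first idea as the tool itself: javascript: and data: values are rejected because they are not listed, not because they were specifically anticipated.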

Real-world examples

1) Remove everything (plain text)

Input:
<div>Hello <a href="http://x">link</a> <img src="i.jpg"></div>
Mode: Remove — HTML tags and attributes
Output:
Hello link
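
For readers who want to reproduce this kind of plain-text extraction outside the browser, here is a rough Python sketch built on the standard library's html.parser. It is not PicoToolkit's implementation. The class name TextOnly and the function strip_all are hypothetical, and the sketch assumes that script and style contents are discarded, as described for Remove mode.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect text nodes only; drop every tag and all attributes."""

    SKIP = {"script", "style"}  # elements whose contents are discarded too

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text unless we are inside a <script> or <style> element.
        if not self._skip_depth:
            self.parts.append(data)


def strip_all(html: str) -> str:
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)
```

Attribute values never reach the output in this sketch because only handle_data contributes to the result.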

2) Strip attributes only (keep tags, remove class/style/on*)

Input:
<p class="lead" style="color:red" onclick="x()">Hi <strong>there</strong></p>
Mode: Attributes only (no attrs whitelisted)
Output:
<p>Hi <strong>there</strong></p>

3) Keep links and images but remove ARIA/analytics attributes

Input:
<a href="http://site.com" onclick="ga()" aria-label="x">Buy</a>
<img src="p.png" alt="pic" data-track="1" aria-hidden="false">
Mode: Attributes only with Attrs whitelist: a[href], img[src|alt]
Output:
<a href="http://site.com">Buy</a>
<img src="p.png" alt="pic">
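
The attribute filtering in examples 2 and 3 can be approximated in Python with the standard library's html.parser. This is a sketch under assumptions, not the tool's code: the class AttrFilter and function strip_attributes are hypothetical names, the whitelist is passed as a dictionary of tag name to allowed attribute names, and tags are re-serialized in a simple normalized form.

```python
from html import escape
from html.parser import HTMLParser


class AttrFilter(HTMLParser):
    """Keep every tag, but only the attributes allowed for that tag."""

    def __init__(self, allowed):
        # convert_charrefs=False lets entities in text pass through unchanged.
        super().__init__(convert_charrefs=False)
        self.allowed = allowed
        self.out = []

    def _open(self, tag, attrs, close=""):
        keep = self.allowed.get(tag, set())
        rendered = ""
        for name, value in attrs:
            if name not in keep:
                continue
            if value is None:  # boolean attribute such as "disabled"
                rendered += f" {name}"
            else:  # the parser unescapes values, so escape them again
                rendered += f' {name}="{escape(value, quote=True)}"'
        self.out.append(f"<{tag}{rendered}{close}>")

    def handle_starttag(self, tag, attrs):
        self._open(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        self._open(tag, attrs, " /")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def strip_attributes(html, allowed=None):
    parser = AttrFilter(allowed or {})
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Calling strip_attributes with no whitelist removes every attribute, which corresponds to Attributes only mode with nothing whitelisted. Comments and doctype declarations are dropped by this sketch, which may differ from the tool.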

4) Remove ARIA attributes by whitelisting only the attributes to keep

Input:
<button aria-pressed="true" class="btn" data-id="123">OK</button>
Mode: Attributes only with Attrs whitelist: button[class|data-id]
Output:
<button class="btn" data-id="123">OK</button>

5) Keep only semantic tags (tags whitelist)

Input:
<div class="wrap"><p>Intro <span class="meta">meta</span></p></div>
Mode: Tags whitelist with tags: p, strong, em
Output:
<p>Intro meta</p>

6) Preserve inline styles intentionally (advanced)

Input:
<h1 style="font-size:24px">Title</h1>
Mode: Attributes only with Attrs whitelist: h1[style]
Output:
<h1 style="font-size:24px">Title</h1>
Note: allowing style can reintroduce layout or hidden-content risks, so inspect the CSS values.

7) Keep scripts or styles intentionally

Input:
<style>.hid{display:none}</style><script>doEvil()</script>
Mode: Attributes only (tags preserved)
Output:
<style>.hid{display:none}</style><script>doEvil()</script>
Note: scripts/styles remain in Attributes only mode; use Remove mode or exclude those tags via Tags whitelist to eliminate them.

8) Clean scraped HTML but keep simple formatting

Input (scraped):
<div class="article"><h2>News</h2><p>Text <a href="http://x" onclick="x()">link</a></p></div>
Mode: Tags whitelist: h2, p, a + Attrs whitelist: a[href]
Output:
<h2>News</h2><p>Text <a href="http://x">link</a></p>
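
Combining a tags whitelist with an attrs whitelist, as in examples 5 and 8, can be sketched in Python as follows. This is an illustration, not the tool's implementation. WhitelistFilter and clean are hypothetical names, non-whitelisted tags are unwrapped so their text stays, and the sketch assumes that the contents of non-whitelisted script and style elements are dropped along with the tags.

```python
from html import escape
from html.parser import HTMLParser


class WhitelistFilter(HTMLParser):
    """Keep whitelisted tags and attributes; unwrap everything else."""

    RAW_TEXT = {"script", "style"}  # assumption: contents go with the tag

    def __init__(self, tags, attrs=None):
        super().__init__(convert_charrefs=False)  # pass entities through
        self.tags = set(tags)
        self.attrs = attrs or {}
        self.out = []
        self._dropping = None  # raw-text element currently being skipped

    def _emit_open(self, tag, attrs):
        keep = self.attrs.get(tag, set())
        rendered = ""
        for name, value in attrs:
            if name in keep:
                text = escape(value or "", quote=True)  # None renders as ""
                rendered += f' {name}="{text}"'
        self.out.append(f"<{tag}{rendered}>")

    def handle_starttag(self, tag, attrs):
        if tag in self.tags:
            self._emit_open(tag, attrs)
        elif tag in self.RAW_TEXT:
            self._dropping = tag

    def handle_startendtag(self, tag, attrs):
        if tag in self.tags:
            self._emit_open(tag, attrs)

    def handle_endtag(self, tag):
        if tag == self._dropping:
            self._dropping = None
        elif tag in self.tags:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self._dropping is None:
            self.out.append(data)

    def handle_entityref(self, name):
        if self._dropping is None:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if self._dropping is None:
            self.out.append(f"&#{name};")


def clean(html, tags, attrs=None):
    parser = WhitelistFilter(tags, attrs)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Unwrapping rather than deleting non-whitelisted elements is what keeps the text of the span in example 5 and of the outer div in example 8.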

Tips & edge cases

  • Because the tool uses whitelists only, explicitly add any tag or attribute you want to keep — otherwise it will be removed in whitelist modes.
  • Data URIs (images or SVG) and style url() values can hide executable content — avoid whitelisting data: URIs unless you trust the source.
  • If you need to remove only lines with certain content, combine this tool with PicoToolkit's filter tool.
  • Malformed HTML is processed by a tolerant parser. Still, check critical outputs (emails, imports) before publishing.

Extended FAQ

Does the tool support blacklists (removing specific tags/attributes only)?

No. The tool operates with whitelists only. You must explicitly list tags and attributes you want to keep. This whitelist-first approach reduces accidental retention of unsafe markup.

Are <script> and <style> removed by default?

It depends on the mode:

  • Remove mode: removes all tags and keeps only the text; script/style elements are removed together with their contents.
  • Attributes only: does not remove tags, so script/style blocks remain (only attributes are stripped).
  • Tags whitelist: only tags you list are preserved; script/style remain only if you include them.

Always check outputs when keeping script/style content.

How do I safely keep links but remove tracking or event attributes?

Use Attributes only mode with an Attrs whitelist that includes a[href] (and specific link attributes like title or rel if needed). That removes onclick and data-* tracking attributes while preserving href values.

Can I preserve inline styles?

Yes. Include style in the Attrs whitelist for the specific tag (for example, h1[style]). Be aware that allowing styles can reintroduce layout or hidden-content issues, so inspect the CSS values before using the output in production.

What about ARIA attributes and accessibility metadata?

ARIA attributes are removed by default unless you add them to the Attrs whitelist. If accessibility metadata is important for your output, include only the specific ARIA attributes you need (for example: button[aria-pressed]).

Is processing limited?

There is no hard server-side limit. Processing is constrained by the browser's available memory. For very large files, break the input into chunks or use a server-side workflow.
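
For the server-side workflow mentioned above, one option is an incremental parser that accepts input piece by piece, so a tag split across two chunks is still handled correctly. The sketch below uses Python's html.parser for plain-text extraction. It is an assumption about how such a workflow could look, not something PicoToolkit provides, and the names TextCollector and strip_in_chunks are hypothetical.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect text nodes while the input arrives in pieces."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_in_chunks(chunks):
    """Feed an iterable of HTML fragments and return the combined text."""
    parser = TextCollector()
    for chunk in chunks:
        # feed() buffers an incomplete tag until the rest of it arrives.
        parser.feed(chunk)
    parser.close()  # flush anything still buffered
    return "".join(parser.parts)
```

Because the parser carries state between calls, chunk boundaries do not need to line up with tag boundaries. Splitting the input by hand and cleaning each piece separately does not have that guarantee.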

© PicoToolkit 2022-2026 All rights reserved. Before using this website read and accept terms of use and privacy policy. Icons by Icons8