• Skip to main content
  • Skip to primary sidebar

Technipages

Smart phone, gadget and computer tutorials

  • Topics
    • Android
    • Gaming
    • Hardware
    • Internet
    • iOS
    • MacOS
    • Office
    • Software
    • Windows
    • Definitions
  • Product Reviews
  • Downloads
  • About
What Are HTML Entities

What Are HTML Entities

Posted on November 12, 2020 by Mel Hawthorne Leave a Comment

HyperText Markup Language, or HTML, is the primary language for web pages on the internet. It includes support for a number of other languages that add extra functionality and styling such as JavaScript and CSS. All of these languages are text-based with some meaningful characters used to separate literal strings that should be printed to the browser and code that should be interpreted and executed.

This design has some issues though, these become obvious when you want to print one of the meaningful characters to the browser. The best example characters to use are the “less than” and “greater than” symbols. Respectively these symbols are used to open and close code segments in HTML. The correct method of printing these characters to the screen safely is to use HTML entities.

HTML entities and security

Thanks to these characters having a special meaning you have to be really careful to make sure that you replace them with the HTML entity version if you want them to be printed to the browser. Unfortunately, many web developers forget that users can submit input to many websites. If this user input includes meaningful characters and they aren’t replaced with HTML entities, in a process called sanitisation, then the website has a Cross-Site Scripting (XSS) vulnerability.

Tip: Don’t try submitting special characters to websites in an attempt to find XSS vulnerabilities. Doing so is technically hacking and is a criminal offence unless you have permission from the owner of the website.

How HTML entities work (and sometimes don’t)

HTML entities work because the browser knows to display it as the relevant special character and to not treat it as a special character. All HTML entities start with an ampersand “&” and end with a semi-colon “;”. Most characters are identified by an entity number although some special characters have a shorthand name too. For example “&”, “<”, and “>” have the entity numbers “&”, “<”, and “>” as well as the entity names “&amp;”, “&lt;”, and “&gt;” respectively. The browser knows that these strings mean it needs to display the relevant characters.

Tip: A full list of character entity names can be found here, although entity name support varies by browser.

In most cases, users should only ever see the characters that HTML entities represent. It’s possible, however, to see encoded characters, commonly ampersand “&”, through a process called “Double encoding”. This happens as the ampersand character appears in its own encoded version. Double encoding generally happens when input is correctly encoded, as it’s submitted, however, when it’s being output it gets sanitised again. This results in the ampersand at the start of the “&amp;” getting encoded a second time and appearing as “&amp;amp;”, the browser then correctly interprets that as a string that should be printed as “&amp;” having decoded the HTML entity and ignored the partial entity.

You Might Also Like

  • How to Change View to HTML or Plain Text in Outlook 2019, 2016, and 365
    How to Change View to HTML or Plain Text in Outlook…

Filed Under: Internet

Reader Interactions

Did this help? Let us know! Cancel reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Primary Sidebar

Recent Posts

  • Microsoft Teams Disconnects Bluetooth Headphones
  • Fix Skype Error: Exchange Needs Your Credentials
  • Fix Skype Notifications Not Working on Windows 10
  • Teams in Outlook: We Couldn’t Schedule the Meeting
  • VR Oculus Quest 2: How to Configure a New Room-Scale Boundary
  • VR Oculus Quest 2: How to Adjust Boundary Sensitivity
  • Dropbox: How To Change the Date Format
  • Microsoft Teams: There Was a Problem Saving the Photo

Who’s Behind Technipages?

Baby and Daddy My name is Mitch Bartlett. I've been working in technology for over 20 years in a wide range of tech jobs from Tech Support to Software Testing. I started this site as a technical guide for myself and it has grown into what I hope is a useful reference for all.

Follow me on Twitter, or visit my personal blog.

You May Also Like

  • Html 4.01

© Copyright 2021 Technipages · All Rights Reserved · Privacy