It has been 15 years, and we're still reporting homograph attacks – web domains that stealthily use non-Latin characters to appear legit

What's old is new again as infosec bods are sounding the alarm over a fresh wave of homoglyph characters being used to lure victims to malicious fake websites. Researchers at Soluble today said they worked with Verisign to thwart the registration of domain names that use homoglyphs – non-Latin characters that look just like …

  1. Tom 38

    Homoglyphs are DISGUSTING

    This is not the brexit I voted for.

    1. Yet Another Anonymous coward Silver badge

      Re: Homoglyphs are DISGUSTING

      Ban Latin characters, what have the romans ever done for us?

      1. MiguelC Silver badge

        Re: Homoglyphs are DISGUSTING

        The aqueduct?

        1. Kabukiwookie

          Re: Homoglyphs are DISGUSTING

          The sanitation?

          1. Claptrap314 Silver badge

            Re: Homoglyphs are DISGUSTING

            The roads?

            1. Anonymous Coward

              Re: Homoglyphs are DISGUSTING

              And public order. Remember how this place used to be before that?

      2. Mark 85

        Re: Homoglyphs are DISGUSTING

        Pizza and pasta in all its glorious types. Oh... wines.

    2. bombastic bob Silver badge

      Re: Homoglyphs are DISGUSTING

      what if they identify as ISO8859-1 ??

  2. Anonymous Coward

    A þorny problem, to be sure

    Homograph attacks? They must be taking the þiss...

    1. elkster88

      Re: A þorny problem, to be sure

      Þú, feld, núna.

      1. Anonymous Coward

        Re: A þorny problem, to be sure

        Svifnökkvinn minn er fullur af álum.

    2. Warm Braw

      Re: A þorny problem, to be sure

      Is that you, Eth?

      1. Captain Badmouth

        Re: A þorny problem, to be sure

        Oh, R0n...

    3. Doctor Syntax Silver badge

      Re: A þorny problem, to be sure

      "They must be taking the þiss."

      What's thiss?

      1. AdamWill

        Re: A þorny problem, to be sure

        I'm not sure, but I think it makes a faint 'whooshing' sound.

  3. katrinab Silver badge

    Surely the answer is

    Have some sort of rule that if it looks the same, it is the same

    For example, if I were to visit, the user agent would convert it to before doing the dns lookup.

    Likewise, it should be possible for the user agent to convert thе before looking it up. (difference in my example is first e is taken from the cyrillic alphabet)

    1. John Robson Silver badge

      Re: Surely the answer is

      so convert based on ocr and predominant script?

      1. AdamWill

        Re: Surely the answer is

        No no, don't be silly! Not OCR. A giant matching table that's carefully hand-maintained, obvs.

        Or, you know, it's the 2020s so I guess vaguely promise to teach an AI to do it?

      2. bombastic bob Silver badge

        Re: Surely the answer is

        right - OCR "homonyms" should all translate to the appropriate charset before name lookups happen, or at least before registrars accept them as non-duplicates.

        And doing periodic name cleanup might be a good idea, requiring takedowns of any domain that's a lookalike (and assuming they're being used for fraud).

        So basically construct a map of UTF-8 chars to ISO8859-1 lookalike chars, then run every domain name through that matrix, see if duplicates show up.

        I assume other-than-english lingos might need something similar.
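The lookalike map described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `LOOKALIKES` table is a hypothetical, hand-picked handful of entries (a usable table would be far larger), and the names `skeleton` and `find_collisions` are made up for this example rather than taken from any registrar's tooling.

```python
# Hypothetical, hand-picked lookalike table mapping non-ASCII characters to the
# ASCII letter they resemble. A real table would need many more entries.
LOOKALIKES = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0261": "g",  # LATIN SMALL LETTER SCRIPT G
}


def skeleton(name: str) -> str:
    """Replace every known lookalike with its ASCII counterpart."""
    return "".join(LOOKALIKES.get(ch, ch) for ch in name.lower())


def find_collisions(names):
    """Group names by skeleton and keep groups with more than one distinct name."""
    groups = {}
    for name in names:
        groups.setdefault(skeleton(name), []).append(name)
    return {
        skel: members
        for skel, members in groups.items()
        if len(set(members)) > 1
    }


if __name__ == "__main__":
    # "ex\u0430mple.com" uses a Cyrillic "a" in place of the Latin one.
    registered = ["example.com", "ex\u0430mple.com", "other.org"]
    for skel, members in find_collisions(registered).items():
        print(skel, "<-", [m.encode("unicode_escape").decode() for m in members])
```

Grouping by skeleton rather than comparing every pair of names keeps the check to a single pass over the list, which matters if it is run over an entire registry.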

    2. ThatOne Silver badge

      Re: Surely the answer is

      Doesn't setting "network.IDN_show_punycode=true" in "about:config" help?
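For anyone wondering what the punycode form looks like, here is a small sketch using Python's built-in `idna` codec. Note the assumptions: that codec implements the older IDNA 2003 rules, so it may not match what a current browser does for every name, and the helper names `to_wire_form` and `looks_internationalised` are invented for this example.

```python
def to_wire_form(hostname: str) -> str:
    """Return the ASCII form of a hostname, as it would be sent in a DNS query.

    Uses Python's built-in "idna" codec (IDNA 2003), which converts each
    non-ASCII label to an "xn--" punycode label.
    """
    return hostname.encode("idna").decode("ascii")


def looks_internationalised(hostname: str) -> bool:
    """True if any label of the hostname needs punycode encoding."""
    return any(
        label.startswith("xn--")
        for label in to_wire_form(hostname).split(".")
    )


if __name__ == "__main__":
    # The second name swaps the Latin "a" for the Cyrillic one (U+0430).
    for host in ["example.com", "ex\u0430mple.com"]:
        print(to_wire_form(host), looks_internationalised(host))
```

Showing the `xn--` form makes a swapped character obvious, at the cost of making every legitimate non-Latin name unreadable.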

    3. Crisp

      Re: if I were to visit


      1. katrinab Silver badge

        Re: if I were to visit

        I suppose it would be the same as yeRegister.

    4. eldakka

      Re: Surely the answer is

      But what if I wanted to go to and not

  4. Hans 1

    Chrome notified me for www.ɡ, www.gooɡ and www.ɡooɡ, Firefox did not ... hm ... Firefox ?

    1. Phil O'Sophical Silver badge

      I just tried it, FF 68.3esr warned me "Deceptive site"

      1. Robert Grant

        FFX72 (or could be 73 - I can't tell which) didn't warn for the first one, at least.

        1. 1752

          Did notice FF73 did not offer 'open in new tab' when highlighting the text.

    2. Anonymous Coward

      Firefox automatically changed www.ɡ to

      1. Yes Me Silver badge

        Anything's for sale these days

        But it also took me to a vendor that shall not be named and:

        "ɡ is listed for sale!

        A great domain can be the key to your success"

  5. Doctor Syntax Silver badge

    What a glyph looks like is entirely up to the font in use. There's nothing to stop email clients and browsers defaulting to fonts which make a clear distinction.

    1. Ken Hagan Gold badge

      Not really. For the font to be usable, it has to make glyphs look recognisable to a native user of the script. Scripts, in turn, have a habit of containing (for example) a letter that looks like a small circle. You can't make that look different from another small circle without making at least one of them look wrong.

      On the other hand, a mixture of scripts within the same part of a domain name is almost certainly dodgy, so there does appear to be an easy way for browsers to detect the fraudsters.
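A mixed-script check along the lines suggested above might look something like this. It is a rough sketch under stated assumptions: Python's standard library does not expose the Unicode Script property, so the first word of each character's Unicode name stands in for the script, which is only an approximation. The function names are invented for this example. Digits and hyphens are ignored, so Chinese characters next to "arabic" numerals are not flagged, but legitimately mixed writing such as Japanese kanji plus kana would be.

```python
import unicodedata


def rough_script(ch: str) -> str:
    """Approximate a character's script by the first word of its Unicode name.

    For example "LATIN SMALL LETTER A" gives "LATIN" and
    "CYRILLIC SMALL LETTER IE" gives "CYRILLIC". This is a stand-in for the
    real Script property, not an equivalent.
    """
    return unicodedata.name(ch, "UNKNOWN").split(" ")[0]


def scripts_in_label(label: str) -> set:
    """Collect the rough scripts of the letters in one label.

    Digits, hyphens and other non-letters are skipped.
    """
    return {rough_script(ch) for ch in label if ch.isalpha()}


def mixed_script_labels(hostname: str) -> list:
    """Return the labels of a hostname that mix more than one rough script."""
    return [
        label
        for label in hostname.split(".")
        if len(scripts_in_label(label)) > 1
    ]


if __name__ == "__main__":
    # The second name swaps the Latin "a" for the Cyrillic one (U+0430).
    for host in ["example.com", "ex\u0430mple.com"]:
        print(host.encode("unicode_escape").decode(), mixed_script_labels(host))
```

Checking label by label, rather than the whole name, allows a name with a non-Latin label under a Latin top-level domain to pass.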

      1. Warm Braw

        there does appear to be an easy way for browsers to detect the fraudsters

        I'm not so sure about that. Back in the days of "code pages" numeric values were interpreted in a cultural context to determine the glyph they represented. These days, Unicode code points just represent glyphs, there isn't any real concept of which "script" they represent - related characters are grouped into blocks, but there may be multiple blocks associated with a particular language group - and more can be added over time. Some glyphs are omitted from certain blocks because they already exist elsewhere. And there's no reason to impose a restriction on any domain that it can only use the glyphs common in one particular culture - combining, say, Chinese characters with "arabic" numerals. There have also been complaints that Unicode fails adequately to distinguish superficially similar glyphs from different languages - particularly in Asia. Simple rules won't help you work out which Unicode strings are likely to be deceptive in general: they'll only tell you which ones aren't ASCII.

      2. A random security guy

        The Unicode standard tries very hard to make characters that look the same map to the same Unicode character across many scripts, e.g. many Han (CJK) characters belong to both the Japanese and the Chinese character sets and have the same Unicode code point.

        In fact, most scripts use the same numeral system.

        However, as this article points out, it is still possible to have two very similar looking characters with different codes. It just slips through, or it just happened that something that looks like an 'o' also exists in another script. Pulling the 'o' from one script into another can create a pockmarked character set, making many string operations difficult: is 'o' < 'p' if 'o' is pulled in from another script?

        There are also political ramifications.

        Pulling in characters from other, similar scripts can create a sudden rise in the temperature of the injured party. However, if the characters look similar, it is probably because they fought a few battles, leading to an exchange of ideas and knowledge.
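The 'o' < 'p' question can be tried out directly. This small sketch assumes plain code point comparison, which is what Python's default string ordering does; locale-aware collation would behave differently. The helper name `codepoint_sorted` is made up for this example.

```python
LATIN_O = "o"          # U+006F
LATIN_P = "p"          # U+0070
CYRILLIC_O = "\u043e"  # U+043E, looks like a Latin "o" in most fonts


def codepoint_sorted(chars):
    """Sort characters by raw code point (Python's default str ordering)."""
    return sorted(chars)


if __name__ == "__main__":
    print(LATIN_O < LATIN_P)     # the Latin pair compares as expected
    print(CYRILLIC_O < LATIN_P)  # the lookalike sorts after "p"
    print([hex(ord(c)) for c in codepoint_sorted([LATIN_P, CYRILLIC_O, LATIN_O])])
```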

  6. Bachelorette

    No Homo

    I use the browser extension "No Homo Graph" to catch these homo graphs in domain names.
