Two things will survive a nuclear holocaust: Cockroaches and crafty URLs like ғасеьоок.com

It's been known for a long while that people can use similar-looking non-Roman characters to create internet addresses that look similar to real ones. These dishonest URLs have been doing the rounds for years. And, sadly, the abuse of homographs to craft dodgy web addresses continues well into this day, according to security …

  1. Tannin

    Fix it in-browser

    I'd be happy just to have a browser setting that rejects all non-standard addresses, or at least warns the user. By "non-standard" in this context I mean:

    (a) anything that isn't 100% Roman characters. (Other people who use other languages would require different settings, of course. That's fine. I'm just saying what would work for me, and probably you too.)

    (b) anything that doesn't have a standard TLD. Yes to (e.g.) .com, .net.nz, .co.uk, .org.ca, and .gov.au. No to rubbish TLDs like .smile and .biz and .anybloodything 'coz I have never yet seen a useful site on, for example, .shop and wouldn't miss it. Ever.

    A router setting would be even better, but less likely and more cumbersome. A simple browser setting would do. And if by any chance I really want to go to a site with a weirdo address one day, I could always use a different browser or (better) have a way to allow exceptions.

    (Yes, yes, non-Roman addresses have perfectly valid uses. No argument there. But for something like 90% of us English speakers, these uses do not apply. They only cause trouble. Doubtless a similar comment would apply, with appropriate modifications to, for example, Spanish speakers.)
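
    Rules (a) and (b) above could be sketched roughly as follows. This is only an illustration, not how any browser does it: the function name `is_plain_address` and the `ALLOWED_SUFFIXES` list are made up for the example, and the suffixes are simply the ones named in the comment.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist built from the suffixes named in the comment;
# it is not a real standard or a complete list.
ALLOWED_SUFFIXES = (".com", ".net.nz", ".co.uk", ".org.ca", ".gov.au")


def is_plain_address(url: str) -> bool:
    """Return True only for ASCII hostnames ending in an allowlisted suffix."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    # Rule (a): reject non-ASCII hostnames, and also Punycode ("xn--") labels,
    # because that is the ASCII disguise non-ASCII names wear on the wire.
    if not host.isascii():
        return False
    if any(label.startswith("xn--") for label in host.split(".")):
        return False
    # Rule (b): the hostname must end in one of the accepted suffixes.
    return host.endswith(ALLOWED_SUFFIXES)


if __name__ == "__main__":
    for candidate in ("https://www.example.co.uk/login", "https://example.shop/"):
        print(candidate, is_plain_address(candidate))
```

    An exceptions list, as suggested above, would just be a second set of hostnames checked before these two rules.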

    1. ukgnome
      Trollface

      Re: Fix it in-browser

      (b) anything that doesn't have a standard TLD

      But how will i know i am getting the good porn if xxx is blocked.

      1. Tannin
        Flame

        Re: Fix it in-browser

        "But how will i know i am getting the good porn if xxx is blocked."

        Here in Victoria on this summer Friday it was 40-odd degrees for much of the day, and still a sweaty 30-odd now. Frankly, just at this moment, I'm more interested in a cold XXXX than a hot XXX.

      2. Voland's right hand Silver badge

        Re: Fix it in-browser

        But is it xxx or ххх?

        The first xxx above is Latin "x" 3 times. The second is Russian/Bulgarian/Serbian "h" 3 times.
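
        For anyone who wants to see the difference rather than take it on trust, a small sketch (the helper name `describe` is invented for the example) that prints the code point and Unicode name of each character:

```python
import unicodedata

latin = "\u0078" * 3      # LATIN SMALL LETTER X, three times
cyrillic = "\u0445" * 3   # CYRILLIC SMALL LETTER HA, three times


def describe(text: str) -> list:
    """List each character as 'U+XXXX NAME' so lookalikes can be told apart."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


if __name__ == "__main__":
    print(latin == cyrillic)   # the strings differ even if they render alike
    print(describe(latin))
    print(describe(cyrillic))
```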

        1. Fruit and Nutcase Silver badge
          Thumb Up

          Re: Fix it in-browser

          @Voland's right hand

          The first xxx above is Latin "x" 3 times. The second is Russian/Bulgarian/Serbian "h" 3 times.

          Works either way...

          "hhh" - hot, hot, hot

    2. ibmalone

      Re: Fix it in-browser

      (a) anything that isn't 100% Roman characters. (Other people who use other languages would require different settings, of course. That's fine. I'm just saying what would work for me, and probably you too.)

      While "error on anything not ASCII" is rarely a good solution, and here would cause localisation nightmares, a slightly different method may work: mixed alphabets are very rare, so trigger a security warning in the address bar if encountering, for example, mixed Latin and Cyrillic, and highlight the different code blocks in some way (colour would be first choice, but it needs an accessible alternative by default too).
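
      A minimal sketch of that mixed-alphabet check, under a loud assumption: it guesses each letter's script from the first word of its Unicode character name, which is a crude stand-in for the real Unicode script property. The function names are invented for the example.

```python
import unicodedata


def scripts_in_label(label: str) -> set:
    """Approximate the scripts used in one DNS label.

    Assumption: the first word of a letter's Unicode name ("LATIN",
    "CYRILLIC", ...) is treated as its script. Digits and hyphens are ignored.
    """
    scripts = set()
    for ch in label:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def is_mixed_script(hostname: str) -> bool:
    """Flag a hostname if any single label mixes more than one script."""
    return any(len(scripts_in_label(label)) > 1 for label in hostname.split("."))


if __name__ == "__main__":
    # Latin letters with one Cyrillic "a" (U+0430) slipped in.
    print(is_mixed_script("p\u0430ypal.com"))
    print(is_mixed_script("paypal.com"))
```

      The check works per label, so a Cyrillic name under a Latin TLD is not flagged just for that.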

      1. veti Silver badge

        Re: Fix it in-browser

        "Disable the code that renders domains as normal words" looks like a pretty good solution to me.

        I've suggested in the past, requiring a separate browser window for each character set you want to use. So if you want to browse in English, Russian and Hebrew simultaneously, you'd need three separate windows. Why not?
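
        "Disabling the rendering" amounts to showing the ASCII (Punycode) form that is actually looked up in DNS. A rough sketch using Python's built-in `idna` codec, which implements the older IDNA 2003 rules, so treat it as an illustration rather than what a current browser does. The helper name `wire_form` is made up for the example.

```python
# The headline's lookalike label, written as explicit Cyrillic code points
# so the example does not depend on how this page was encoded.
spoof = "\u0493\u0430\u0441\u0435\u044c\u043e\u043e\u043a.com"


def wire_form(hostname: str) -> str:
    """Return the ASCII form of a hostname, Punycode-encoding non-ASCII labels."""
    return hostname.encode("idna").decode("ascii")


if __name__ == "__main__":
    print(wire_form(spoof))           # an "xn--" label followed by ".com"
    print(wire_form("facebook.com"))  # plain ASCII passes through unchanged
```

        Shown that way in the address bar, the fake no longer looks anything like the real name.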

        1. Anonymous Coward
          Anonymous Coward

          Re: Fix it in-browser

          "I've suggested in the past, requiring a separate browser window for each character set you want to use."

          There goes online language teaching.

        2. Voland's right hand Silver badge

          Re: Fix it in-browser

          So if you want to browse in English, Russian and Hebrew simultaneously, you'd need three separate windows. Why not?

          Probably overkill. However, highlighting characters which do not belong in a page according to the page's declared language and encoding is not a bad idea. If nothing is declared, anything that isn't Latin ASCII should be highlighted in blinking red :)

          1. J 3
            Alien

            Re: Fix it in-browser

            "highlighting characters which do not belong in a page according to the page declared language and encoding is not a bad idea"

            Probably overkill too. Should be done only if said characters are in a link (e.g. external href and mailto), no?

          2. Charles 9

            Re: Fix it in-browser

            And if it's an intentionally polyglot page that uses UTF-8 as its character set?

          3. Adrian 4

            Re: Fix it in-browser

            "So if you want to browse in English, Russian and Hebrew simultaneously, you'd need three separate windows. Why not?"

            Edge case. The foremost cause of feeping creaturism. Have an option to permit it rather than inflict it on the rest of us.

          4. P. Lee

            Re: Fix it in-browser

            Why not fix it with dns?

            The TLD has a record which sets the encoding and deviations are flagged in the browser for the URL.

      2. GrapeBunch

        Re: Fix it in-browser

        trigger a security warning in the address bar if encountering, for example, mixed Latin and Cyrillic,

        Your heart is in the right place, but

        РЕАСЕ.СОМ

        for example, is 100% Cyrillic.

        My wife got one of these just last week "but they're going to close our account!". Just like the main story, it had English syntax errors and the URL had for example the Russian К rather than the Latin K. Delete, now! For the Люб of Бог !
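
        Catching an all-Cyrillic lookalike needs a comparison against the names you actually trust, not a mixed-script test. A minimal sketch, assuming a tiny hand-picked table of lookalike capitals; the `LOOKALIKES` table and both function names are invented for the example, and Unicode's published confusables data is far larger.

```python
# Tiny hand-picked sample of Cyrillic capitals that resemble Latin ones.
# Illustrative only; not a complete or authoritative confusables table.
LOOKALIKES = {
    "\u0420": "P",  # CYRILLIC CAPITAL LETTER ER
    "\u0415": "E",  # CYRILLIC CAPITAL LETTER IE
    "\u0410": "A",  # CYRILLIC CAPITAL LETTER A
    "\u0421": "C",  # CYRILLIC CAPITAL LETTER ES
    "\u041e": "O",  # CYRILLIC CAPITAL LETTER O
    "\u041c": "M",  # CYRILLIC CAPITAL LETTER EM
    "\u041a": "K",  # CYRILLIC CAPITAL LETTER KA
}


def skeleton(text: str) -> str:
    """Replace each known lookalike with the Latin letter it resembles."""
    return "".join(LOOKALIKES.get(ch, ch) for ch in text)


def imitates(candidate: str, trusted: str) -> bool:
    """True if candidate is a different string that looks like trusted."""
    return candidate != trusted and skeleton(candidate) == skeleton(trusted)


if __name__ == "__main__":
    spoof = "\u0420\u0415\u0410\u0421\u0415.\u0421\u041e\u041c"  # all Cyrillic
    print(skeleton(spoof))
    print(imitates(spoof, "PEACE.COM"))
```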

    3. Yes Me Silver badge
      Headmaster

      Re: Fix it in-browser

      Tannin is correct that this problem needs to be handled in the user interface. It's simply not possible to fix it in the DNS protocol, or to a large extent in DNS registration policies. Technically, the problem is pretty much indistinguishable from "Don't be evil."

      Consider that human writing systems have evolved over thousands of years, and none of them was designed for the Internet. There isn't space in this box to write a technical essay, but here is a sequence of such essays if people want gory details:

      RFC4290

      RFC4690

      RFC5890

      RFC5891

      RFC5894

      RFC5992

      RFC6912

    4. aberglas

      I'll not need non-ascii

      Since it has become an important tenet of font design to make "I" almost exactly the same shape as "l".

  2. TonyJ
    Joke

    Joke

    The version of that joke I heard recently was the three things that will survive a nuclear holocaust would be bacteria, cockroaches and the DFS sale

    (one for Brits... for our cousins across the pond, DFS is a furniture store that seems to have a never-ending sale of one kind or another being advertised)

  3. jake Silver badge

    Eh?

    What about tardigrades?

    And Keith Richards, of course.

    1. Anonymous Coward
      Joke

      Re: Eh?

      Keith Richards' body died in the sixties, it's just his brain never got the message.

      1. jake Silver badge

        Re: Eh?

        Other way 'round.

    2. Anonymous Coward
      Anonymous Coward

      Re: Eh?

      Keith Richards cannot be killed by conventional weapons...

  4. Qwertius

    Non-Roman...?

    Non-Bleedin Roman ?

    Listen mate call me what I am.

    I Kike - a Yid - I Hate the Bloody Romans.... a lot.

    1. Roy Nottroy

      Well what did they ever do for us?

  5. Cuddles

    Is it actually a threat?

    "internet engineers just don't believe it's that much of a threat."

    Which seems to be a fair point really. Most people are perfectly happy to click on any random link that comes along, even if it's obviously a scam. There are plenty of ways available to obfuscate link destinations already in common use, including simple misspellings, while URL shortening sites are used by many legitimate businesses and hide the destination entirely. As long as people are happy to be fooled by a wide variety of methods, assuming they bother to check the destination address at all, non-Roman characters adding one more way obfuscation can be done is a pretty minor issue overall.

  6. IGnatius T Foobar
    FAIL

    Facebook FAIL

    Based on the kind of hellhole the real Facebook has turned into, the site at ғасеьоок.com might actually be a big improvement.

    1. Anonymous Coward
      Anonymous Coward

      Re: Facebook FAIL

      No, alas. Anybody who doesn't know that ь never follows a vowel (as in ғасеьоок) is functionally illiterate.

      1. Julian Bradfield

        Re: Facebook FAIL

        Voyna i Mor: ь follows vowels in some orthographies, e.g. Chechen in Cyrillic. But even there, it doesn't follow e.

        1. Adrian 4

          Re: Facebook FAIL

          ғасеьоок.com reads as 'racebook.com' to me.

  7. J 3
    Coat

    "finally bother to address the issue"

    I see what you did there...
