Find an Illegal String

Code Golf Asked by nneonneo on October 27, 2021

The challenge is to find a string of characters that cannot appear in any legal program in your programming language of choice. That includes comments, strings, or other “non-executable” parts.

Challenge

  • Your program may be specific to a particular version or implementation of your language’s compiler/interpreter/runtime environment. If so, please specify the particulars.
  • Only standard compiler/interpreter/runtime options are permitted. You cannot pass some weird flag to your compiler to get a specific result (e.g. passing a flag to convert warnings into errors).
  • If your programming language requires a specific encoding (e.g. UTF-8), your string must also be correctly encoded (i.e. strings which fail solely due to character decoding errors are not allowed).
  • Every individual character in your submission must be admissible in a legal program; that is, you can’t just use a character which is always rejected.
  • The compiler/interpreter/runtime must give an error when given any source code that contains your string as a substring. The error does not have to be the same across programs – one embedding of your string might cause a syntax error, while another might cause a runtime error.

Scoring

  • Shortest illegal string for each language wins.
  • You should explain why your string is illegal (why it cannot appear anywhere in a legal program).
  • Dispute incorrect solutions in the comments. More specifically, you should provide a link to TIO or equivalent demonstrating a legal program (i.e. one that doesn’t produce any errors) that contains the proposed substring.
  • Some languages (e.g. Bash, Batch, Perl) allow arbitrary binary data to be appended to a program without affecting validity (e.g. using __DATA__ in Perl). For such languages, you may submit a solution that can appear only in such a trailing section. Make sure to make a note of that in your answer. (The definition of this “trailing section” is language-dependent, but generally means any text after the parser has entirely stopped reading the script).

Example

In Python, I might submit

x
"""
'''

but this can be embedded into the larger program

"""
x
"""
'''
y
'''

so it isn’t admissible.
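A crack like this can be checked mechanically. The sketch below uses CPython's built-in `compile` as the judge, which only covers errors raised at compile time (the challenge also counts runtime errors, which this does not model). The helper name `parses` is made up for the illustration.

```python
def parses(source: str) -> bool:
    """Return True if CPython accepts `source` at compile time."""
    try:
        compile(source, "<candidate>", "exec")
        return True
    except SyntaxError:
        return False


# The proposed illegal string from the example above.
candidate = 'x\n"""\n\'\'\'\n'

# The larger program from the example: the candidate ends up inside
# two ordinary triple-quoted string literals.
embedding = '"""\n' + candidate + "y\n'''\n"

print(parses(candidate))   # the candidate on its own
print(parses(embedding))   # the candidate embedded in the larger program
```

If the second call reports success while `candidate in embedding` holds, the submission is cracked.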

56 Answers

Swift, 4 bytes

There's control characters in here, so I'll just give their ASCII values: 0A 2F 2A 00.

Most of the ways that other C-family languages do this don't work in Swift, because Swift allows nested block comments.

The null character always generates at least a warning. However, if it's in the middle of whitespace or a line comment, it's only a warning, not an error. Additionally, Xcode lets you put most control characters inside of a block comment and just rolls with it.

Emphasis on most. For some reason, if you have a null byte in the middle of a block comment, the Swift compiler gets lost and can't find the end of the block comment, no matter where it is. This causes an error (and also breaks Xcode's syntax highlighting). Similarly, you can't have a null byte in a raw multiline string literal, so I don't have to worry about those.

So, this is four bytes:

  • 0A -- a newline, to make sure we're not in a line comment.
  • 2F 2A -- /*, the start of a block comment.
  • 00 -- a null byte, to prevent the block comment or multiline raw string from terminating.

Answered by Bbrk24 on October 27, 2021

Pip Classic, 10 bytes


;\"
;`
*;

Try it online!

I think this should be solid, but I welcome anyone to prove me wrong.

How?

A semicolon at the start of a line begins a single-line comment, so our code is effectively

*;

and Pip complains because * is a binary operator and needs a left operand.

If we add any expression before the snippet, we still get an error because * needs a right operand. No operand can be supplied because * is followed by the expression terminator ;.

We can't wrap the snippet in a string, because every possible string delimiter (", \", and `) is matched by a delimiter in the snippet. (In a regular "-delimited string, the backslash is not an escape character, so it doesn't mess anything up.)

Finally, we can't comment out the last line of the snippet because Pip Classic doesn't have block comments.

Answered by DLosc on October 27, 2021

Google Sheets, Excel; 3 bytes

For a single-cell Formula (starts with + or =). Otherwise, you can just put whatever arbitrary text you want into a cell.

Here it is: 1"1

  • A string has to be enclosed in "s. A double quote must be escaped by "".
  • The " has to be a string token boundary.
  • If it's the end, following a string with a 1 is illegal.
  • If it's the beginning, preceding a string with a 1 is also illegal.

Answered by Calculuswhiz on October 27, 2021

Dis, 2 bytes.

))

How it works

  • ( to the nearest ) is a comment.
  • But nothing can match the second ) above in this syntax.
  • Thus it is a syntax error.

Answered by nrgmsbki4spot1 on October 27, 2021

Rust, 65540 65538 bytes

\r"##### ... a lot more #s ... ###`

That is a literal carriage return (not \r; that's there as a placeholder because every time I try to copy a single carriage return I wind up with just a newline on the clipboard), followed by a double quote, 65535 #s, and a backtick.

In Rust, the only way to come at this challenge is to break out of any comments or strings and then cause a syntax error. The comments are easy: carriage returns not followed by newlines are disallowed in comments. Regular string literals are easy too: just adding an unescaped " will do it. Raw string literals are much harder, and initially had me thinking this challenge is impossible in Rust. Raw string literals do not have escapes and can be surrounded and terminated by any number of #s. For example, the literal r####"ab"###"c"#### is perfectly valid Rust and evaluates to ab"###"c. However, when I looked at rustc's lexer, I noticed something interesting: the lexer disallows raw string literals with more than 65535 #s as delimiters. That means I just have to add a lot of #s and tack a backtick onto the end, which unconditionally errors if it isn't in a comment or string literal.
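Since the string is too long to paste and the carriage return does not survive the clipboard, it may be easier to generate it. This is a minimal sketch in Python (used here only as a generator, not as the answer's language); the function name is hypothetical, and the layout (carriage return, double quote, 65535 #s, backtick) is read off the snippet above.

```python
MAX_HASHES = 65535  # the delimiter limit quoted in the answer


def build_illegal_rust_string() -> bytes:
    """Carriage return, double quote, MAX_HASHES '#' characters, backtick."""
    return b"\r" + b'"' + b"#" * MAX_HASHES + b"`"


payload = build_illegal_rust_string()
print(len(payload))  # 65538, matching the byte count in the title
```

Writing `payload` to a file in binary mode keeps the lone carriage return intact.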

edit: my previous solution was invalid because block comments can be nested. This is slightly golfier anyway.

Answered by Aiden4 on October 27, 2021

Desmos, 3 bytes




(linefeed, backslash, linefeed)

I recently said "Pasting in invalid-formatted text [into Desmos] simply does nothing." As I recently discovered, this isn't quite true. When pasting in multiple lines, it's happy to throw errors if one of them is invalid. This works even if some of the other lines are blank, which causes them to be ignored. Therefore, we put a \ on its own line, which is guaranteed to throw an error.

Logically, we should be able to knock this down to two bytes (\, LF), as no legal line can end with a \. However, Desmos strangely interprets this as an empty list (normally represented as []), for reasons I don't understand. This makes that two-byte string an unusual almost-illegal string, where the only valid program containing that string is the string itself.

Answered by Ethan Chapman on October 27, 2021

NDBall 2 bytes

Specifically in NDBallSim V1.0.1

##

A hash sign is the memory cell instruction and requires a direction, so two of them anywhere will cause the error

NDBall Parse ERROR @ LINE (whatever line): Memory cell requires a direction ex: #>12

This double-character trick can actually be done with many more characters; to be exact, all of these: )(,}{|><+-pP$%Y][E and newline.

This is actually all of the usable characters in the language (discounting digits), because there is never a case where you use the same character twice in a row.

Answered by Aspwil enson on October 27, 2021

Zig 0.6.0, 2 bytes


`

(\n`)

A newline will reset the tokenizer state, escaping any comments or multiline strings and erroring for single line strings. Backticks are not a valid character except in strings.

Try it online!

Answered by pfg on October 27, 2021

Scratch (1.x, except 1.2 beta), scratchblocks syntax, 26 bytes


when gf clicked
say(()/()

Leading new line ensures that "when gf clicked" will not be in a comment, so that what's below it will run.

This errors when run in the Stage, because the Stage cannot use the say block.

This errors when run in a sprite by itself, because a divide by zero is attempted. (When a number argument is blank, it is read as 0.)

This errors when run in a sprite when ) or a new line is added after it, because it doesn't change what the existing code does.

This errors when run in a sprite when something else is added after it, because it creates an undefined block which causes an error when run.


1.2 beta is excluded because it featured comment blocks (different from modern comments), which supported multiline, which this could be put into.

Versions past 1.4, including the Experimental Viewer, are excluded because dividing by zero does not cause an error.

Answered by qarz on October 27, 2021

Jelly, 3 bytes

»
€

Try it online!

Jelly has no comments, and every line in the code is parsed whether or not it is reachable. The only way to prevent some code from being executed is starting a string literal using “. This is countered by », which terminates a string (and interprets it as a dictionary-compressed string).

Thus € (each) will always be executed on a link (line) of its own. It tries to get a link from its left; since none exists, the interpreter errors.

Answered by fireflame241 on October 27, 2021

naz, 2 bytes

0d

This command will attempt to divide the register by zero, throwing the error division by zero.

Also 2 bytes:

0p

This command will attempt to find the remainder of dividing the register by zero, which will also throw the error division by zero.

Answered by sporeball on October 27, 2021

JavaScript, 7 bytes

 */
#`#
  • Adding a // at the beginning will still not work because of the leading newline, leaving the second line uncommented.

  • Adding a /* will not uncomment the string completely because of the closing */ that completes it, leaving the # exposed.

  • Adding ` will not quote the string completely, because of ` that completes a string, leaving the # exposed.

  • Regular expressions won't work because of # following / character.

  • / following * cannot be parsed as a regular expression, as regular expressions cannot have newlines

Try it!

clicky.onclick=a=>{console.clear();console.log(eval(before.value+" */\n#`#"+after.value));}
textarea {
  font-family: monospace;
}
<button onclick="console.clear();">Clear console</button>
<br>
<textarea id=before placeholder="before string"></textarea>
<pre><code> */
#`#</code></pre>
<textarea id=after placeholder="after string"></textarea>
<br>
<button id=clicky>Evaluate</button>

Answered by Konrad Borowski on October 27, 2021

Unreadable, 2 bytes

''

Try it online!

All Unreadable commands must be of the form '""…", with one ' followed by 1 to 10 "s. Having two successive 's anywhere in the program leads to error: parser failed: invalid command (0): '.
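The rule above is easy to model as a token-level check. This is only a sketch of the stated rule (one ' followed by 1 to 10 "s, repeated); it assumes a program consisting solely of those two characters and does not model whether each command receives the arguments it needs. The name `is_well_formed` is made up for the illustration.

```python
import re

# Zero or more commands, each one ' followed by between 1 and 10 " characters.
COMMANDS = re.compile("(?:'\"{1,10})*")


def is_well_formed(program: str) -> bool:
    """True if the whole program splits into '-then-quotes commands."""
    return COMMANDS.fullmatch(program) is not None


print(is_well_formed("'\"'\"\""))  # two commands: '" and '""
print(is_well_formed("''"))        # the submitted string
```

Two adjacent 's leave the first one with no "s after it, so no split into commands exists, whatever surrounds the pair.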

Answered by Robin Ryder on October 27, 2021

Milky Way, 2 bytes

Tries to execute an undefined opcode. Milky Way does not have comments. The newline is for ending strings.


)

Try it online!

Answered by user85052 on October 27, 2021

Keg, 3 4 bytes

Fixed a for loop bug noted by @Jono2906


)ø.

Try it online!

Explanation

\n   Terminate a line-comment
)     End a for loop
 ø    Clear the stack
  .   Try to print the TOS item, which will create an error to the program.

Answered by user85052 on October 27, 2021

Turing Machine Code 5 bytes

Assuming block editing isn't allowed:

0

Or with the symbols showing:

< cr >< lf >
0
< cr >< lf >

Without block editing, it is impossible to stick this behind the comment symbol (';'), as the '0' will end up on the next line anyway. There is no block commenting in Turing Machine Code, a fact that is taken advantage of in other answers here as well. This patch of code would not only not run, it would kill the whole program before it can begin to execute, no matter where it is placed.

Try it online!

Answered by ouflak on October 27, 2021

Shakespeare Programming Language, 2 bytes

.:

Explanation: If this string is in the title of the play, the . ends it and the : is not a valid character name. Similar problems occur in an act and scene name. No character can speak a line beginning with :, and the . will end a Recall statement, which can otherwise create a comment.

Answered by Hello Goodbye on October 27, 2021

Go, 6 bytes


*/```

Try to crack it online!

The grave accent (`) marks a raw string literal, inside which all characters except `, including newlines and backslashes, are interpreted literally as part of the string. Three `'s in a row are the core: adjacent string literals are invalid and ` always closes a ` string, so there's no way to make sense of them. I had to use 3 more bytes for anti-circumvention, a newline so we can't be inside a single-line comment or a normal quoted string, and a */ so we can't be inside a multi-line comment.

Answered by Purple P on October 27, 2021

INTERCAL, 12 bytes

DOTRYAGAINDO

Try to crack it online!

INTERCAL's approach to syntax errors is a bit special. Essentially, an invalid statement won't actually error unless the program tries to execute it. In fact, the idiomatic syntax for comments is to start them with PLEASE NOTE, which really just starts a statement, declares that it isn't to be executed, and then begins it with the letter E. If your code has DODO in the middle of it, you could prepend DOABSTAINFROM(1)(1) and tack any valid statement onto the end and you'll be fine, if it's DODODO you can just bend execution around it as (1)DON'TDODODOCOMEFROM(1). Even though INTERCAL lacks string literal syntax for escaping them, there's no way to use syntax errors to create an illegal string, even exhausting every possible line number with (1)DO(2)DO...(65535)DODODO, since it seems that it's plenty possible to have duplicate line numbers with COME FROM working with any of them.

To make an illegal string, we actually need to use a perfectly valid statement: TRY AGAIN. Even if it doesn't get executed, it strictly must be the last statement in a program if it's in the program at all. 12 bytes is, to my knowledge, the shortest an illegal string can get using TRY AGAIN, because it needs to guarantee that there is a statement after it (executed or not) so DOTRYAGAIN is just normal code, and it needs to make sure that the entire statement is indeed TRY AGAIN, so TRYAGAINDO doesn't work because it can easily be turned into an ignored, normal syntax error: DON'TRYAGAINDOGIVEUP, or PLEASE DO NOT TRY TO USE TRYAGAINDO NOT THAT IT WOULD WORK. No matter what you put on either side of DOTRYAGAINDO, you'll error, with either ICL993I I GAVE UP LONG AGO, ICL079I PROGRAMMER IS INSUFFICIENTLY POLITE, or ICL099I PROGRAMMER IS OVERLY POLITE.

Answered by Unrelated String on October 27, 2021

Runic Enchantments, 3 bytes

One of many possible variations.

Try it online!

Runic uses Unicode combining characters as modifiers, in an "M modifies the behavior of C" fashion (where C is a command). As such, no two modifiers are allowed to modify the same command, and the parser will throw an error if such an occurrence is found.

Similarly, certain commands that redirect the IP cannot be modified in any way, due to the existence of direction modifying modifier characters (and both in the same cell makes no sense).

There is no way to escape or literal-ize the string to make it valid. Tio link contains a ; in order to bypass the higher-priority "no terminator" error.

Answered by Draco18s no longer trusts SE on October 27, 2021

Ink, 5 bytes


*/{}

Leading newline ends single-line comments.
*/ ends multi-line comments. Thanks to the leading newline, you can't put a / in front of it to make it the start of a comment rather than the end of one.
{ and } enclose things meant to be parsed, rather than simply printed. If there's nothing between them, the compiler gets sad because it Expected some kind of logic, conditional or sequence within braces: { ... } but saw '}'. This happens even inside string literals, so there's no need to check if we're inside one of those.

Answered by Sara J on October 27, 2021

ECMAScript Regex, 4 bytes

]](+

Try it on regex101

This is quite a bit easier than PCRE. There's no \Q...\E, no free-spacing mode, and no comments. But if we used just ](+ we could still be inside a character class and have our ] escaped, as [](+] which would be treated as a character class of ](+. So we still need the double ] to make sure we exit any character class we may have been in, which works even if a range was started, e.g. [!-]](+.

(+ is illegal in any context other than a character class, and will give an error message such as "Nothing to repeat" or "Incomplete group structure" / "The preceding token is not quantifiable".

Answered by Deadcode on October 27, 2021

PCRE Regex, 6 7 bytes

\E
)](+

Try it on regex101

Any string not containing \E would be legal inside a \Q...\E literal sequence. By starting this one with \E, we break out of such a sequence if we were in one. And if we weren't in one, but are preceded by a \, then the result is an escaped backslash followed by a literal E, and we'll still be guaranteed not to be inside a \Q...\E.

(+ is not part of any legal group structure, and will generate a compile-time error ("Incomplete group structure" and/or "The preceding token is not quantifiable" / "quantifier does not follow a repeatable item") anywhere other than:

  1. Within a \Q...\E. We've handled this. We can't be inside one thanks to the \E.
  2. Immediately after a \. But since we have it immediately after a \E, it can't be immediately after a \. Note that \E alone is valid, even if there was no \Q before it.
  3. Inside a #... style comment. This can only happen in free-spacing mode, but this can be turned on by (?x) anywhere in a regex, so we need to handle it.
  4. Inside a (?#...) style comment.
  5. Inside a character class, e.g. [(+] or [\Q\E(+], or even [\Q\E](+], which will be treated as [](+], which, since an empty character class is not part of PCRE syntax (except in PCRE2 with the PCRE2_ALLOW_EMPTY_CLASS option enabled), is treated as a character class consisting of ](+ (the beginning of a character class is the only place where ] does not need to be escaped).

Because of #3, we need a newline, to break out of any #... comment we may be in.

Because of #4, we need a ), to break out of any (?#...) comment we may be in.

And it is because of #5 that we need to put ] in front of the (+. This closes any character class we might have been in. If we hadn't already put a newline and/or a ) to close a potential comment, we'd need this to be ]], because a character class can't be empty and ] is allowed to be the first character in a class without being escaped. In any case, thanks to having a character before our ], it even works if a range was started, e.g. [!-\E)](+.

Edit: Silly me, didn't protect it from being inside comments. Fixed.

Answered by Deadcode on October 27, 2021

Ruby, 58 23 27 bytes AND proof of impossibility at bottom

This snippet is valid in any Ruby version prior to Ruby 2.3 (when heredocs were added):


=end
)}end/;[}'"//;[}#{]}

Old version (invalid):


=end
)}end/;kill(Process.pid,-9)'"//;kill Process.pid,-9

This cannot be used anywhere in a Ruby program except after __END__.

Proof of impossibility

This solution is impossible in any Ruby version after and including Ruby 2.3.

Any Ruby solution can be inserted into this snippet and function as a valid program:

<<'string'

# Insert code here

string

With this particular snippet, you can add (note leading newline)


string

to your solution in order to invalidate the program. However, changing the "name" of the heredoc will again invalidate your solution. Heredocs can have infinite placeholders, meaning a solution accounting for all of them would be infinitely long. Thus an answer in Ruby 2.3+ is impossible.
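The argument amounts to a simple procedure: for any finite candidate, pick a heredoc terminator that does not occur as a line of the candidate. Here is a minimal sketch of that procedure in Python, which only generates Ruby source text and does not run it. The function name and the `string0`, `string1`, ... naming scheme are made up for the illustration.

```python
import itertools


def wrap_in_heredoc(candidate: str) -> str:
    """Wrap `candidate` in a single-quoted Ruby heredoc it cannot terminate."""
    lines = set(candidate.split("\n"))
    # A finite candidate has finitely many lines, so some name is always free.
    for n in itertools.count():
        delimiter = f"string{n}"
        if delimiter not in lines:
            break
    return f"<<'{delimiter}'\n{candidate}\n{delimiter}\n"


# A candidate that tries to close a heredoc named "string0".
attempt = "\nstring0\n=end\n"
print(wrap_in_heredoc(attempt))
```

The candidate starts right after a newline and is followed by one, so each of its lines is a whole line of the wrapped program, and none of them equals the chosen terminator.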

Thanks to histocrat for pointing this out.

Answered by CG One Handed on October 27, 2021

Husk1, 3 bytes

Try it online!

Explanation

The newlines force ◊ to be parsed as a supposed built-in; however, since it's not (yet) implemented, the parsing fails with unexpected "\9674" (9674 is the decimal code point of ◊) or an error because of empty lines.

Note: Initially I tried to force an inference failure, but the type-checking is done lazily and one can easily "un-break" programs with adding a valid main function.


1: The code might work at some point in the future. So to be precise any version of Husk as of before the date of this post (ie. at least up to commit 0806b9d).

Answered by ბიმო on October 27, 2021

Scratch (scratchblocks2), 3 bytes

There's no such thing as an error in scratchblocks2 - just red-colored blocks - but this can't be expressed in actual Scratch, so I think it's OK.


<?

Leading newline to avoid this just being commented or ::ed out.

Then a predicate block with its label starting with ? - there's no such block.

Answered by W. K. on October 27, 2021

Powershell, 10 8 12 14 13 14 16 bytes

-2 byte thanks to Mazzy finding a better way to break it
+4 -1 bytes thanks to IsItGreyOrGray

$#>
'@';
"@";
@=

I hope this works. ' and " to guard against quotes, #> to break the block-comment, new lines to stop the single-line comment, both '@ and "@ to catch another style of strings, and then starts an improper array to throw a syntax error.

The logic being they can't use either set of quotes to get in, they can't block-comment it out, If @" is used, it'll create a here-string which can't have a token afterwards, and if they leave it alone, it'll try to make a broken array. This statement wants to live so hard, I keep finding even more holes in the armor.

Answered by Veskah on October 27, 2021

Rockstar, 4 5 bytes

Crossed out 4 is still 4 :(

)
"""

Rockstar is a very... wordy language.
While " can be used to define a string, such as Put "Hello" into myVar, to my knowledge there is no way for 3 quotes to appear outside of a comment, and the close paren ensures that won't happen either (Comments in Rockstar are enclosed in parentheses, like this).

Rockstar also has a poetic literal syntax, in which punctuation is ignored, so the newline makes sure that the 3 quotes are the start of a line of code, which should always be invalid

Answered by Mayube on October 27, 2021

Literate Haskell, 15 bytes

Repairing a deleted attempt by nimi.


\end{code}
5
>

Try it online!

nimi's original attempt is the last two lines, based on Literate Haskell not allowing > style literate code to be on a neighboring line to a literate comment line (5 here). It failed because it can be embedded in a comment in the alternate ("LaTeX") literate coding style:

\begin{code}
{-
5
>
-}
\end{code}

However, the \begin{code} style of Literate Haskell does not nest, neither in itself nor in {- -} multiline comments, so by putting a line with \end{code} just before the line with the 5, that workaround fails, and I don't see a different one.

Answered by Ørjan Johansen on October 27, 2021

FRACTRAN, 2 bytes

()

Try it online!

Since FRACTRAN doesn't have any way of including comments or literals (AFAICT), this will always error any valid program, since all valid programs must be a valid fraction, and this string can never be part of a valid fraction.

Answered by Conor O'Brien on October 27, 2021

JavaScript (Node.js), 9 8 bytes

`*/
\u`~

Try it online!

I think this should be illegal enough.

Previous JS attempts in other answers


;*/u)

By @Cows quack

As an ES5 answer this should be valid, but in ES6 wrapping the code with a pair of backticks wrecks this. As a result valid ES6 answers must involve backticks.

`
`*/}'"`u!

By @iovoid

This is an improved version involving backticks. However a single / after the code breaks this (It becomes a template literal being multiplied by a regex, useless but syntactically valid.) @Neil made a suggestion that changing ! to ). This should theoretically work because adding / at the end no longer works (due to malformed regex.)

Explanation

`*/
\u`~

This is by itself illegal, and also blocks all single and double quotes because those quotes cannot span across lines without a \ at the end of a line

//`*/
\u`~

and

/*`*/
\u`~

Blocks comments by introducing illegal escape sequences

``*/
\u`~

Blocks initial backtick by introducing non-terminated RegExp literal

console.log`*/
\u`~

Blocks tagged template literals by introducing an expected operator between two backticks

Answered by Shieru Asakoto on October 27, 2021

TI-Basic (83+/84+/SE, 24500 bytes)

A

(24500 times)

TI(-83+/84+/SE)-Basic does syntax checking only on statements that it reaches, so even 5000 End statements in a row can be skipped with a Return. This, in contrast, cannot fit into the RAM of a TI-83+/84+/SE, so no program can contain this string. Being a bit conservative with the character count here.

The original TI-83 has 27000 bytes of RAM, so you'll need 27500 As in that case.

TI-Basic (89/Ti/92+/V200, 3 bytes)

"

Newline, quote, newline. The newline closes any comments (and disallows embedding the illegal character in a string, since AFAIK multiline string constants are not allowed), the other newline disallows closing the string, and the quote gives a syntax error.

You can get to 2 bytes with

±

without the newline, but I'm not sure whether this counts because ± is valid only in string constants.

Answered by bb94 on October 27, 2021

AutoHotkey, 5 bytes

` is the escape character. You can only escape a " when assigning it to a variable.

\n*/ prevents it from being commented out or assigned to a variable.


*/`"

Answered by nelsontruran on October 27, 2021

SmileBASIC, 2 bytes


!

Nothing continues past the end of a line, so all you need is a line break followed by something which can't be the start of a statement. ! is the logical not operator, but you aren't allowed to ignore the result of an expression, so even something like !10 would be invalid (while X=!10 works, of course)

Similar things will work in any language where everything ends at the end of a line, as long as it parses the code before executing it.

There are a lot of alternative characters that could be used here, so I think it would be more interesting to list the ones that COULD be valid.

@ is the start of a label, for example, @DATA; ( could be part of an expression like (X)=1 which is allowed for some reason; any letter or _ could be a variable name X=1, function call LOCATE 10,2, or keyword WHILE 1; ' is a comment; and ? is short for PRINT.

Answered by 12Me21 on October 27, 2021

JavaScript, 11 characters

`
`*/}'"`u)

The backticks make sure to kill template strings, the quotes get rid of strings, the newline avoids commented lines, the end of comment avoids block comments, and the last backtick and escape (with a ) to avoid appending numbers or /) try to start an invalid string.

Try it online!

Answered by iovoid on October 27, 2021

Ly, 6 bytes


""{)

(note the leading newline)

The newline prevents line comments, Ly doesn't have block comments, the "" ensures that all open string literals will close, and the unmatched brackets raise the error.

Answered by LyricLy on October 27, 2021

Gaia, 3 bytes


#“

Try it online!

Each line in Gaia is a separate function, so the newline ensures that the code starts at the beginning of a function. Even putting a newline in a string literal will start a new function, since Gaia allows omitting closing quotes. In addition, all functions are parsed before execution, so adding additional functions below won't help.

The # is a meta, which has to directly follow an operator. At the start of the function, there is no operator, so it's a syntax error.

The “ is an opening quote for string literals. It's there because Gaia also allows omitting the opening quote of strings at the start of a function. If this opening quote wasn't here, you could write #” which is entirely legal.

Answered by Business Cat on October 27, 2021

Ada - 2 bytes

I think this should work:


_

That's newline-underscore. Newline terminates comments and isn't allowed in a string. An underscore cannot follow whitespace; it used to be allowed only after letters and numbers, but the introduction of Unicode made things complicated.

Answered by xaambru on October 27, 2021

COBOL (GNU), 8 bytes


THEGAME

First, a linefeed to prevent you from putting my word in a commented line.

Then, some history: COBOL programs were printed on coding sheets, and the compiler still relies heavily on lines limited to 80 characters. There are no multiline comments, and the first 6 characters of each line are a comment area (often used as editable line numbers) where you can put almost anything, AFAIK. I chose THEGAM for the beginning of the next line.

Then, the 7th character of any line only accepts a very restricted list of characters: space (no effect), asterisk (comments out the rest of the line), hyphen, slash; there may be others, but certainly not E.

The error given by GnuCobol, for instance, is :

error: invalid indicator 'E' at column 7
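The column-7 rule can be illustrated with a small checker. This is a rough sketch, not GnuCOBOL's actual logic: the indicator set contains only the four characters named above (which the answer itself says may be incomplete), and treating lines shorter than 7 characters as having a blank indicator is an assumption. All names are hypothetical.

```python
# Indicators named in the answer; the real list may be longer.
KNOWN_INDICATORS = {" ", "*", "-", "/"}


def indicator(line: str) -> str:
    """Column-7 character of a fixed-format line (assumed blank if too short)."""
    return line[6] if len(line) > 6 else " "


def bad_indicator_lines(source: str) -> list[tuple[int, str]]:
    """(line number, indicator) for every line whose column 7 is not known."""
    return [
        (number, indicator(line))
        for number, line in enumerate(source.split("\n"), start=1)
        if indicator(line) not in KNOWN_INDICATORS
    ]


print(bad_indicator_lines("\nTHEGAME"))
```

The leading linefeed matters here: it guarantees that THEGAME starts in column 1, so the E always lands in column 7.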

Try it online!

Also, you just lost the game.

Answered by PhilDenfer on October 27, 2021

Commodore 64 Basic, 2 bytes


B

(that's a newline followed by the letter "B").

Any line in a Commodore 64 program must begin with either a line number or a BASIC keyword, and stored programs only permit line numbers. There are no keywords beginning with "B" (or "H", "J", "K", "Q", "X", "Y", or "Z").

Answered by Mark on October 27, 2021

x86 32-bit machine code, 11 bytes (and future-proof 64-bit)

90 90 90 90 90 90 90 90 90 0f 0b

This is times 9 nop / ud2. It's basically a NOP sled, so it still runs as 0 or more nops and then ud2 to raise an exception, regardless of how many of the 0x90 bytes were consumed as operands to a preceding opcode. Other single-byte instructions (like times 9 xchg eax, ecx) would work, too.

x86 64-bit machine code, 10 bytes (current CPUs)

There are some 1-byte illegal instructions in 64-bit mode, until some future ISA extension repurposes them as prefixes or parts of multi-byte opcodes in 64-bit mode only, separate from their meaning in 32-bit mode. 0x0e is push cs in 32-bit mode, but illegal on current CPUs (tested on Intel Skylake) in 64-bit.

0e 0e 0e 0e 0e 0e 0e 0e 0e 0e

Rules interpretation for executable machine code:

  • The bytes can't be jumped over (like the "not parsed" restriction), because CPUs don't raise exceptions until they actually try to decode/execute (non-speculatively).

  • Illegal means always raises an exception, for example an illegal-instruction exception. (Real programs can catch that with an exception handler on bare metal, or install an OS signal handler, but I think this captures the spirit of the challenge.)


It works because a shorter byte-string ending in ud2 could appear as an imm32 and/or part of the addressing mode for another instruction, or split across a pair of instructions. It's easiest to think about this in terms of what you could put before the string to "consume" the bytes as part of an instruction, and leave something that won't fault.

I think an instruction can consume at most 9 bytes of arbitrary stuff: a SIB byte, a disp32, and an imm32. i.e. the first 2 bytes of this instruction can consume 8 NOPs and a ud2, but not 9.

c7 84 4b 00 04 00 00 78 56 34 12        mov dword [rbx+rcx*2+0x400],0x12345678

Can't beat 9 nops:

    db 0xc7, 0x84   ; opcode + mod/rm byte: consumes 9 bytes (SIB + disp32 + imm32)
    times 9 nop          ; 1-byte xchg eax, ecx or whatever works, too
    ud2
  ----
   b:   c7 84 90 90 90 90 90 90 90 90 90        mov    DWORD PTR [rax+rdx*4-0x6f6f6f70],0x90909090
  16:   0f 0b                   ud2    

64-bit mode:

 c7 84 0e 0e 0e 0e 0e 0e 0e 0e 0e        mov    DWORD PTR [rsi+rcx*1+0xe0e0e0e],0xe0e0e0e
 0e                      (bad)  

But the bytes for 8 NOPs + ud2 (or times 9 db 0x0e) can appear as part of other insns:

    db 0xc7, 0x84   ; defender's opcode + mod/rm that consumes 9 bytes

    times 8 nop          ; attacker code
    ud2

    times 10 nop    ;; defender's padding to be consumed by the 0b opcode (2nd half of ud2)
----
  18:   c7 84 90 90 90 90 90 90 90 90 0f        mov    DWORD PTR [rax+rdx*4-0x6f6f6f70],0xf909090
  23:   0b 90 90 90 90 90       or     edx,DWORD PTR [rax-0x6f6f6f70]
  29:   90                      nop
  2a:   90                      nop
  ...

Answered by Peter Cordes on October 27, 2021

VBA, 2 Bytes

A linefeed followed by an underscore. The _ functions as the line-continuation character in VBA, and here there is nothing directly to its left or on the line above for it to continue. Coupled with VBA's lack of multiline comments, this means the string always throws the compile-time error Compile Error: Invalid character


_

Answered by Taylor Raine on October 27, 2021

Fortran, 14 bytes


end program
e

No multiline comments or preprocessor directives in Fortran.

Answered by Steadybox on October 27, 2021

AWK, 4 bytes



/

Try it online!

Since AWK has no multi-line comments, the string needs 2 newlines before and 1 after the / to prevent commenting it out or turning it into a regex, e.g. by adding 1/. The most common message is unexpected newline or end of string.

With previous crack

Answered by Robert Benson on October 27, 2021

Brain-Hack (a variation of Brain-Flak), 3 2 bytes

Thanks to Wheat Wizard for pointing out that Brain-Hack doesn't support comments, saving me a byte.

(}

Try it online!

Answered by Riley on October 27, 2021

S.I.L.O.S, 4 bytes

Silos is competitive o/


x+

S.I.L.O.S runs on a two-pass interpreter/compiler. Before execution, a "compiler" attempts to simplify the source into an array describing the source. Each line is treated separately. x+a is an assignment operator that will add a to the value of x and store it into x. With nothing after the +, however, the "compiler" will break. Therefore, we take this string and add a newline before and after, ensuring it is on its own line and breaks the compiler.

Try it online!

Answered by Rohan Jhunjhunwala on October 27, 2021

C#, 16 bytes


*/"
#endif<#@#>

Works because:

  • // comment won't work because of the new line
  • /* comment won't work because of the */
  • You can't have constants in the code alone
  • Adding #if false to the start won't work because of the #endif
  • The " closes any string literal
  • The <#@#> is a nameless directive so fails for T4 templates.
  • The new line tricks it so having / at the start won't trick the */

Each variation fails with a compilation error.

Answered by TheLethalCoder on October 27, 2021

CJam, 7 bytes

"e#"
:"

Try it online!

Answered by Erik the Outgolfer on October 27, 2021

APL and MATL and Fortran, 3 bytes


'

Newline, Quote, Newline always throws an error since block comments do not exist.

Answered by Adám on October 27, 2021

Java, 4 bytes

;\u;

Try it online!

This is an invalid Unicode escape sequence and will cause an error in the compiler.

error: illegal unicode escape
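A rough Python model of why the surrounding semicolons matter, assuming the string is `;\u;` and assuming the usual Java rule that a backslash starts a Unicode escape only when it is preceded by an even number of contiguous backslashes. The function name `has_illegal_unicode_escape` is made up for this sketch, and it ignores corner cases such as an escape that itself encodes a backslash.

```python
import re

HEX4 = re.compile(r"[0-9a-fA-F]{4}")


def has_illegal_unicode_escape(source: str) -> bool:
    """Return True if `source` contains a backslash-u escape without 4 hex digits."""
    i = 0
    while i < len(source):
        if source[i] != "\\":
            i += 1
            continue
        # Measure the run of contiguous backslashes starting at i.
        j = i
        while j < len(source) and source[j] == "\\":
            j += 1
        run = j - i
        # Only an odd-length run leaves its last backslash eligible.
        if run % 2 == 1 and j < len(source) and source[j] == "u":
            k = j
            while k < len(source) and source[k] == "u":
                k += 1              # any number of u's is allowed
            if not HEX4.match(source, k):
                return True
            i = k + 4
        else:
            i = j
    return False


ILLEGAL = ";\\u;"   # the 4-byte string: semicolon, backslash, u, semicolon

# Typical attempts to neutralise it by choosing what comes before and after.
attempts = [("", ""), ("\\", ""), ("\\\\", ""), ("", "0041"), ('"', '"'), ("//", "")]
for before, after in attempts:
    print(repr(before), repr(after), has_illegal_unicode_escape(before + ILLEGAL + after))
```

In this model the leading `;` stops a preceding backslash from pairing with the one in the string, and the trailing `;` stops appended hex digits from completing the escape.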

Answered by user41805 on October 27, 2021

C (clang), 16 bytes

 */
#else
#else

Try it online!

*/ closes any /* comment, and the leading space makes sure we didn’t just start one. The newline closes any // comment and breaks any string literal. Then we cause an #else without #if or #else after #else error (regardless of how many #if 0s we might be inside).

Answered by Anders Kaseorg on October 27, 2021

Free Pascal, 18 bytes


*)}{$else}{$else}

First close all possible comments, then handle conditional compile.

Please comment here if I forgot something.

Answered by tsh on October 27, 2021

JavaScript, 7 bytes


;*/\u)

Note the leading newline.

  • \u) is an invalid Unicode escape sequence and this is why this string is invalid
  • Adding a // at the beginning will still not work because of the leading newline, leaving the second line uncommented
  • Adding a /* will not uncomment the string completely because of the closing */ that completes it, leaving the u) exposed
  • As stated by @tsh, the bottom line can be turned into a regex by having a / after the string, so by having the ) in front of the \u, we can ensure that the regex literal will always be invalid
  • As stated by @asgallant, one could do 1||1(string)/ to avoid having to evaluate the regex. The semi-colon at the beginning of the second line stops that from happening by terminating the expression 1||1 before it hits the second line, thus forcing a SyntaxError with the ;*.

Try it!

clicky.onclick=a=>{console.clear();console.log(eval(before.value+"\n;*/\\u)"+after.value));}
textarea {
  font-family: monospace;
}
<button onclick="console.clear();">Clear console</button>
<br>
<textarea id=before placeholder="before string"></textarea>
<pre><code>
;*/\u)</code></pre>
<textarea id=after placeholder="after string"></textarea>
<br>
<button id=clicky>Evaluate</button>

Answered by user41805 on October 27, 2021

Pyth, 6 bytes

¡¡$¡"¡

¡ is an unimplemented character, meaning that if the Pyth parser ever evaluates it, it will error out with a PythParseError. The code ensures this will happen on one of the ¡s.

There are three ways a byte can be present in a Pyth program, and not be parsed: In a string literal (" or .", which are parsed equivalently), in a Python literal ($) and immediately after a \.

This code prevents \ from making it evaluate without error, because that only affects the immediately following byte, and the second ¡ errors.

$ embeds the code within the $s into the compiled Python code directly. I make no assumptions about what might happen there.

If the program reaches this code in a $ context, it will end at the $, and the ¡ just after it will make the parser error. Pyth's Python literals always end at the next $, regardless of what the Python code might be doing.

If the program starts in a " context, the " will make the string end, and the final ¡ will make the parser error.

Answered by isaacg on October 27, 2021

Python, 10 bytes (not CPython)


?"""?'''?

Edit:

Due to @feersum's diligence in finding obscure ways to break the Python interpreter, this answer is completely invalidated for any typical CPython environment as far as I can tell (Python 2 and 3, on both Windows and Linux). I do still believe that these cracks will not work for PyPy on any platform (the only other Python implementation I have tested).

Edit 2:

In the comments @EdgyNerd has found this crack taking advantage of a non-ascii encoding declaration! This seems to decode to print(""). I don't know exactly how this was found but I imagine the way to fix this sort of exploit would maybe be to try different combinations of any invalid characters where the ?s are, and find one that doesn't behave well with any encoding.


Note the leading newline. Cannot be commented out due to the newline, and no combination of triple quoted strings should work if I thought about this correctly.

@feersum in the comments seems to have completely broken any CPython program on Windows, as far as I can tell, by adding the 0x1A character to the beginning of a file. It seems that maybe (?) this is due to the way the operating system handles this character, apparently translating it to an EOF as it passes through stdin because of some legacy DOS standard.

In a very real sense this isn't an issue with Python but with the operating system. If you create a Python script that reads the file and uses the builtin compile on it, it gives the more expected behavior of throwing a syntax error. PyPy (which probably does just this internally) also throws an error.
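A minimal sketch of that check, feeding candidate programs straight to the builtin compile so that no file or stdin handling is involved. The helper name `compiles` and the list of prefixes are illustrative choices, not an exhaustive set of cracks, and this does not cover tricks that depend on how a file's bytes are decoded (such as an encoding declaration).

```python
# The 10-byte string: newline, ?, three double quotes, ?, three single quotes, ?
ILLEGAL = "\n?\"\"\"?'''?"


def compiles(source: str) -> bool:
    """Return True if the builtin compile accepts `source` as a module."""
    try:
        compile(source, "<candidate>", "exec")
        return True
    except (SyntaxError, ValueError):
        return False


# Some obvious attempts to hide the string in a comment or a string literal.
prefixes = ["", "#", '"""', "'''", '"', "'"]
results = {prefix: compiles(prefix + ILLEGAL) for prefix in prefixes}

for prefix, ok in results.items():
    print(repr(prefix), "accepted" if ok else "rejected")
```

Whichever quote style is opened before the string, one of the triple quotes inside it closes the literal and leaves a bare ? for the tokenizer to reject.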

Answered by KSab on October 27, 2021

Changeling, 2 bytes




That's two linefeeds. Valid Changeling must always form a perfect square of printable ASCII characters, so it cannot contain two linefeeds in a row.

The error is always a parser error and always the same:

This shape is unpleasant.

accompanied by exit code 1.

Try it online!
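The shape rule can be modelled in a few lines. This is only a sketch of the rule as stated above (n lines of n printable ASCII characters each); the function name `is_perfect_square` is hypothetical, and details such as how the real interpreter treats a trailing linefeed are not covered.

```python
def is_perfect_square(source: str) -> bool:
    """Model of the stated rule: n lines, each of exactly n printable ASCII chars."""
    lines = source.split("\n")
    side = len(lines)
    return all(
        len(line) == side and all(32 <= ord(ch) <= 126 for ch in line)
        for line in lines
    )


print(is_perfect_square("ab\ncd"))      # a 2x2 square
print(is_perfect_square("ab\n\ncd"))    # contains two linefeeds in a row
```

Two consecutive linefeeds always produce an empty line, and an empty line can never be as long as the number of lines, so under this model no square can contain them.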

Answered by Dennis on October 27, 2021
