Uses for special characters in Java code
Overview
Ever wondered how you can write code like this in Java?
if( ⁀ ‿ ⁀ == ⁀ ⁔ ⁀ || ¢ + ¢== ₡)
Background
Underscores have long been used in C-like languages such as Java to distinguish field and method names. It is common to see a leading underscore like _field or an underscore in a constant like UPPER_CASE. In Java the $ also appears in compiler-generated class names and accessor method names.
The SCJP notes state:
Identifiers must start with a letter, a currency character ($), or a connecting character such as the underscore ( _ ). Identifiers cannot start with a number!
This leads to the question: what other connecting characters are there?
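The quoted rule can be tried out with the JDK's own predicates. The following is a minimal sketch, not the compiler's actual check: the class and method names (IdentifierCheck, looksLikeIdentifier) are made up for illustration, and it deliberately ignores keywords, so "int" would pass.

```java
class IdentifierCheck {
    // Hypothetical helper: true if every code point is allowed at its position
    // in a Java identifier. It does NOT reject keywords or literals.
    static boolean looksLikeIdentifier(String s) {
        if (s.isEmpty()) return false;
        int first = s.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) return false;
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) return false;
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(looksLikeIdentifier("_field"));   // true
        System.out.println(looksLikeIdentifier("$value"));   // true
        System.out.println(looksLikeIdentifier("1abc"));     // false, starts with a digit
        System.out.println(looksLikeIdentifier("\u203Fx"));  // true, starts with UNDERTIE
    }
}
```

The strings use Unicode escapes so the example does not depend on the source file's encoding.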
What are connecting characters?
A connecting character joins two words together. This page lists ten connecting characters:
U+005F | LOW LINE | _
U+203F | UNDERTIE | ‿
U+2040 | CHARACTER TIE | ⁀
U+2054 | INVERTED UNDERTIE | ⁔
U+FE33 | PRESENTATION FORM FOR VERTICAL LOW LINE | ︳
U+FE34 | PRESENTATION FORM FOR VERTICAL WAVY LOW LINE | ︴
U+FE4D | DASHED LOW LINE | ﹍
U+FE4E | CENTRELINE LOW LINE | ﹎
U+FE4F | WAVY LOW LINE | ﹏
U+FF3F | FULLWIDTH LOW LINE | _
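The same list can be reproduced from the JDK itself. This sketch assumes that "connecting character" corresponds to the Unicode general category Pc (Character.CONNECTOR_PUNCTUATION); the class name ConnectorPunctuation is only illustrative, and the exact set printed depends on the Unicode version your JDK ships with.

```java
import java.util.ArrayList;
import java.util.List;

class ConnectorPunctuation {
    // Collects every code point whose general category is Pc (connector punctuation).
    static List<Integer> connectors() {
        List<Integer> out = new ArrayList<>();
        for (int cp = Character.MIN_CODE_POINT; cp <= Character.MAX_CODE_POINT; cp++) {
            if (Character.getType(cp) == Character.CONNECTOR_PUNCTUATION) {
                out.add(cp);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (int cp : connectors()) {
            // Character.getName returns the Unicode name, e.g. "LOW LINE" for U+005F.
            System.out.printf("U+%04X | %s%n", cp, Character.getName(cp));
        }
    }
}
```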
And if you try the following, you may find it compiles. (From Java 9 onwards a lone _ is a keyword, so drop the first name there.)
int _, ‿, ⁀, ⁔, ︳, ︴, ﹍, ﹎, ﹏, _;
While this is interesting, does it have a use? Recently I found one.
I have an object which represents a column, and each row has a value for that column. The names are basically the same, but I want a notation to distinguish them. So I have something like
Column<Double> ︴tp︴ = table.getColumn("tp", double.class);
double tp = row.getDouble(︴tp︴);
This way I can see which tp is the column, and which is the value.
Interestingly the currency characters are valid as well.
// List every code point that may start an identifier but is not alphabetic.
for (int i = Character.MIN_CODE_POINT; i <= Character.MAX_CODE_POINT; i++)
    if (Character.isJavaIdentifierStart(i) && !Character.isAlphabetic(i))
        // toChars also handles code points above 0xFFFF, which a (char) cast would truncate
        System.out.println(i + " : " + new String(Character.toChars(i)));
prints
36 : $
95 : _
162 : ¢
163 : £
164 : ¤
165 : ¥
1547 : ؋
2546 : ৲
2547 : ৳
2555 : ৻
2801 : ૱
3065 : ௹
3647 : ฿
6107 : ៛
8255 : ‿
8256 : ⁀
8276 : ⁔
8352 : ₠
8353 : ₡
8354 : ₢
8355 : ₣
8356 : ₤
8357 : ₥
8358 : ₦
8359 : ₧
8360 : ₨
8361 : ₩
8362 : ₪
8363 : ₫
8364 : €
8365 : ₭
8366 : ₮
8367 : ₯
8368 : ₰
8369 : ₱
8370 : ₲
8371 : ₳
8372 : ₴
8373 : ₵
8374 : ₶
8375 : ₷
8376 : ₸
8377 : ₹
43064 : ꠸
65020 : ﷼
65075 : ︳
65076 : ︴
65101 : ﹍
65102 : ﹎
65103 : ﹏
65129 : ﹩
65284 : $
65343 : _
65504 : ¢
65505 : £
65509 : ¥
65510 : ₩
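To see why each of these characters qualifies, it can help to print its Unicode general category next to it. The sketch below uses the same filter as the loop above and labels each hit; the class name IdentifierStartCategories and the label strings are my own, and the "other" branch is only a safety net in case a JDK version accepts something outside these two categories.

```java
class IdentifierStartCategories {
    // Maps a code point to a readable label for the two categories of interest.
    static String category(int cp) {
        switch (Character.getType(cp)) {
            case Character.CURRENCY_SYMBOL:
                return "currency symbol";
            case Character.CONNECTOR_PUNCTUATION:
                return "connector punctuation";
            default:
                return "other";
        }
    }

    public static void main(String[] args) {
        for (int cp = Character.MIN_CODE_POINT; cp <= Character.MAX_CODE_POINT; cp++) {
            if (Character.isJavaIdentifierStart(cp) && !Character.isAlphabetic(cp)) {
                System.out.println(cp + " : " + new String(Character.toChars(cp))
                        + " : " + category(cp));
            }
        }
    }
}
```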
You can write:
if( ⁀ ‿ ⁀ == ⁀ ⁔ ⁀)
Peter, please have mercy on those guys who might ever need to support that code. They would definitely be at risk of literally tearing their hair out. :-)
I once worked with a legacy project where one class had two methods named '_' and '__' with one calling another. I'd say that was sort of beyond the ordinary experience.
Ok, but I get "The specified message [10527236] was not found."
It looks like it's triggering full GCs because you are running close to your direct memory limit. You can reduce the triggering of GCs by manually freeing the direct blocks, but it appears you are using more than you allowed. I suggest reducing the heap size as you are using ~ 5.3 GB. Try a minimum of 5 and a max of 8 GB and you can increase your direct memory size to say 60 GB. Unfortunately memory profilers are not much help when it comes to direct memory. If you really need this much data I would suggest considering a) compacting the memory by using smaller data types b) memory mapped files so some of the data can be gracefully swapped to disk. btw memory mapped files don't count towards your maximum direct memory.
Thanks for the suggestion, it is very helpful :) We found out that some library JARs we use were allocating a lot of direct memory; we are trying to fix that.
I took advice b) instead :) I changed the 40G of direct memory to a memory mapped tmpfs file, will see whether it works. Just to double check, if I do not specify MaxDirectMemorySize, the JVM should use the heap size as the upper limit for a socket's direct memory allocation, right? Also, MappedByteBuffer is not counted as direct memory?
Yes, I tried Chinese characters before, and they compiled. In fact I guess identifiers can be any Unicode character, just with a non-digit start.
So apart from the keywords, I could generate a Java source file completely in a non-English language like Chinese, with class names, all field names, all method names and all local variables in Chinese.