Georgian Capital letters added to Unicode. Now what?

Last May, Unicode approved 46 capital letters for the Georgian Mkhedruli alphabet.

Maybe it’s a bit early, but operating systems will support this change sooner or later anyway. Out of curiosity, I did a little research into what will change for us developers, and I’m sharing it in this article.

A few definitions, just in case:
Unicode – a standard that maps every symbol to a unique number. It also describes specific rules for different languages. The standard is used throughout the technical world, and everything that processes or displays text follows it: operating systems, platforms, browsers…

UTF-8 – Unicode has the list of symbols and their numeric codes, but it does not care how this information is stored in memory. There are various encodings for that. UTF-8 is one of the most popular, as it uses memory economically and does not spend extra bytes on a symbol that fits in just one. Other encodings are UCS-2, UTF-16, UTF-32…
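To make the variable length visible, here is a small sketch that counts the UTF-8 bytes of a few symbols. The helper name utf8Length is mine, and it assumes a runtime with the standard TextEncoder (a modern browser or a recent Node).

```javascript
// TextEncoder always encodes to UTF-8.
const encoder = new TextEncoder();

// Number of bytes UTF-8 needs for the given text.
function utf8Length(text) {
  return encoder.encode(text).length;
}

// Latin, Cyrillic, Georgian Mkhedruli, the Lari sign and an emoji.
const samples = ['a', 'ж', 'ა', '₾', '😀'];
for (const symbol of samples) {
  const codePoint = symbol.codePointAt(0).toString(16).toUpperCase();
  console.log(symbol + ' U+' + codePoint + ': ' + utf8Length(symbol) + ' byte(s)');
}
```

A Latin letter takes one byte, a Georgian letter three, and an emoji four.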

Changes in a standard cause changes in implementations, and those do not happen immediately. For instance, the Georgian currency symbol ₾ was added in Unicode 8.0 on 17 June 2015, and the Windows update for this symbol was released on 19 January 2016.

Operating systems should update their keyboard drivers so that Georgian users can type capital letters in Caps Lock mode (there are 33 letters in Georgian, so the Shift+symbol combinations are already taken). System fonts should be updated as well, so that the correct glyphs appear during font fallback.

Because the capital and small versions of the same letter have different codes, software developers usually have to take extra care: until now only for other languages, from now on for Georgian, too. For instance, when comparing strings, searching, matching regex patterns, sorting, storing data in a database, etc.
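A minimal sketch of what "different codes" means in practice. It uses the existing Asomtavruli/Nuskhuri pair, which runtimes already treat as capital and small; codePointOf and equalsIgnoreCase are hypothetical helper names of mine, not library functions.

```javascript
const capital = 'Ⴀ'; // U+10A0 GEORGIAN CAPITAL LETTER AN (Asomtavruli)
const small = 'ⴀ';   // U+2D00 GEORGIAN SMALL LETTER AN (Nuskhuri)

// Format the first code point of a string as U+XXXX.
function codePointOf(ch) {
  return 'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0');
}

// Compare two strings after converting both to lower case.
function equalsIgnoreCase(a, b) {
  return a.toLowerCase() === b.toLowerCase();
}

console.log(codePointOf(capital), codePointOf(small)); // U+10A0 U+2D00
console.log(capital === small);                        // false
console.log(equalsIgnoreCase(capital, small));         // true
```

The same kind of explicit case folding will presumably be needed for the Mkhedruli capitals once runtimes pick up the new Unicode version.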

 

Database

MS SQL Server has built-in Unicode support and follows the standard in its operations anyway. Just make sure it follows the correct version: SQL Fiddle

It’s different with MySQL: each database, table or even column can have its own collation, depending on what kind of data it stores. We are accustomed to using utf8_general_ci, as it ‘processes’ Georgian letters, too. Unlike utf8_unicode_ci, this collation does not fully implement the Unicode rules. It was generally chosen for its better performance, but on modern processors the difference is small. utf8_unicode_ci will process the new Georgian letters correctly once MySQL upgrades its Unicode version.

Here is an example:
Together with the unique codes, Unicode also defines the order of symbols, which is used for sorting. E.g. in that order all variants of the Georgian letter ‘ა’ are listed together (Nuskhuri, Asomtavruli and Mkhedruli), followed by the variants of the letter ‘ბ’. The new capital letters will probably be added in the same way.

SQL Fiddle

CREATE TABLE IF NOT EXISTS `test` (
  `content` varchar(200) NOT NULL
) DEFAULT CHARSET=utf8 COLLATE utf8_general_ci;
INSERT INTO `test` (`content`) VALUES
  ('აბგ'),  ('ააააა'),  ('Ⴁააააა'),  ('Ⴀააააა'),  ('bcd'),  ('ab.'),  ('Ⴄ'),  ('ж'),  ('Ж'),  ('ц'),  ('Ц');
  

CREATE TABLE IF NOT EXISTS `test_better` (
  `content` varchar(200) NOT NULL
) DEFAULT CHARSET=utf8 COLLATE utf8_unicode_ci;
INSERT INTO `test_better` (`content`) VALUES
  ('აბგ'),  ('ააააა'),  ('Ⴁააააა'),  ('Ⴀააააა'),  ('bcd'),  ('ab.'),  ('Ⴄ'),  ('ж'),  ('Ж'),  ('ц'),  ('Ц');


select * from `test` d order by d.content;
select * from `test_better` d order by d.content;

Result (first line: utf8_general_ci, second line: utf8_unicode_ci):

ab., bcd, Ж, ж, ц, Ц, Ⴀააააა, Ⴁააააა, Ⴄ, ააააა, აბგ
ab., bcd, ж, Ж, ц, Ц, ააააა, Ⴀააააა, აბგ, Ⴁააააა, Ⴄ

The MySQL 8 beta release, which appeared recently, implements Unicode version 9, while our capital letters are in version 11 🙂

 

JavaScript

Although there are many implementations, we can’t ignore V8, so I’ll base the discussion on it.

JavaScript has Unicode support, but some things are still problematic (e.g. Unicode regex). If we need sorting or filtering on our site, then the ordinary sort won’t be enough any more and we should use a locale-aware collator, which takes the Unicode rules into account. For instance:

let a = ['აბგ','ააააა','Ⴁააააა','Ⴀააააა','bcd','ab.','Ⴄ','ж','Ж','ц','Ц'];
console.log(a.sort());
console.log(a.sort(Intl.Collator('ru').compare));

Unfortunately, it has no support for Georgian collation at all, so we cannot sort Mkhedruli correctly together with Nuskhuri and Asomtavruli. That is a very rare case anyway, so no need to worry. The default sort compares code points, so the result follows the alphabet (with the exception of capital letters).

That problem with capitals can be solved by converting strings to the same case. Giorgi suggested an idea:

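// The comparator must return a number (negative, zero or positive), not a boolean.
myArray.sort(function (s1, s2) { var a = s1.toLowerCase(), b = s2.toLowerCase(); return a < b ? -1 : a > b ? 1 : 0; });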

It will probably work correctly for Georgian, too, once V8 updates its Unicode implementation. This is how it currently works for Asomtavruli and Nuskhuri: "Ⴀ".toLowerCase() => "ⴀ"

It seems that, because the standard defines Asomtavruli as CAPITAL and Nuskhuri as SMALL, these alphabets are implemented as two cases of a single alphabet rather than as two completely different alphabets (V8 source file: unicode.cc, code points are mapped directly).
At the moment Mkhedruli is caseless. It will be interesting to see how it gets marked; I don’t think there is another language with two kinds of capital letters.
In any case, this requires a version upgrade.

Then I remembered that V8 is an open source project, so a volunteer can add the Georgian locale. For the time being, this returns an empty array:

Intl.Collator.supportedLocalesOf('ka')
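Since support depends on the runtime and its locale data, one option is to check for it and fall back to the plain comparison when the locale is missing. This is only a sketch; comparatorFor is a hypothetical helper name.

```javascript
// Returns a locale-aware comparator when the runtime supports the locale,
// otherwise a plain comparison of UTF-16 code units.
function comparatorFor(locale) {
  const supported = Intl.Collator.supportedLocalesOf([locale]);
  if (supported.length > 0) {
    return new Intl.Collator(supported[0]).compare;
  }
  return function (a, b) {
    return a < b ? -1 : a > b ? 1 : 0;
  };
}

const words = ['აბგ', 'ააააა'];
console.log(words.slice().sort(comparatorFor('ka')));
```

For plain Mkhedruli words both branches should give the same alphabetical order; the difference only shows up when several Georgian scripts or cases are mixed.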

 

Java

Java is not in a hurry to upgrade either. JDK 9, which implements Unicode 8 (where the Lari currency symbol was added), was released two years later, in September 2017.
Here strings are compared with ‘equals’. In future we’ll need to use the ‘equalsIgnoreCase’ method for Georgian, too:

"Ⴀ".equals("ⴀ")  => false
"Ⴀ".equalsIgnoreCase("ⴀ")  => true

As one capital alphabet already exists, I’m testing with it; we just don’t generally use that alphabet.

Also, case-insensitive regex search does not work out of the box. By default the ordinary i (ignore case) flag only handles ASCII characters, so for Unicode text we should also pass the Unicode case flag:

"A".matches("(?i)[a]")  => true
"Ⴀ".matches("(?i)[ⴀ]") => false

Pattern.compile("[ⴀ]", Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE).matcher("Ⴀ").matches();  => true

Accordingly, we should keep this in mind wherever strings are used as keys or elements: maps, sets, etc.

 

PHP

Generally, working with Unicode strings is not pleasant in PHP at all, and more conversions will be needed here, too.

 

 
We’ll also need changes elsewhere: in the very convenient search tools, grep and the like. grep’s case-insensitive option does not work for the existing Georgian Asomtavruli capital alphabet even now. I hope the Unicode changes will be reflected in their upgrades, too. They are great tools for regex filtering and searching in large (or small) texts and files.

Many Georgian application systems won’t be able to upgrade their platforms quickly, as testing would take a huge amount of time. They will probably add conversions and validations on the front end to prevent user-entered capital strings from reaching old Java or other back-end systems.

Overall, I like that the capital letters were added (as a result of several people’s hard work). They are an important part of the Georgian language and should not be lost.

Do you have any ideas about what else will need to change?

Some resources about the topic:
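On.ge – UNICODE-მა ქართული მხედრული ანბანის 46 მთავრული ასონიშანი დაამტკიცა (Unicode approved 46 Mtavruli letters of the Georgian Mkhedruli alphabet)
DevFest 2016: Akaki Razmadze –  ❤  [I LOVE UNICODE]
DevFest 2017: Akaki Razmadze – გუტენბერგი, სტივ ჯობსი, გუგლი, ხინკალი, უნიკოდი (Gutenberg, Steve Jobs, Google, Khinkali, Unicode)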
DevFest 2016: Michael Everson – The confusing case history of Georgian in Unicode

Integration tests with databases (Node.js + Mocha)

Automated tests are divided into several categories. In short, unit tests check small fragments of code. For example, there is a function for formatting a phone number. We might have several unit tests covering various scenarios, but if we want to check how a user registers with this number and then logs in, the test has to cover the interaction of several components. That is an integration test (or maybe even an acceptance test).

Generally, we are dealing with an integration test if it uses:

  • A database
  • A network
  • Any external system (e.g. a mail server)
  • I/O operations

The hard part is that, unlike in unit tests, we cannot run test operations directly against external systems. E.g. we cannot send thousands of test mails to randomly generated addresses. There are several ways to solve this kind of problem, depending on what we want to test. Let’s look at the options:

 

Service imitation (Stubs, Mocks)

Let’s assume we’re writing a client application which invokes services on various servers. Our priority is testing the client, so there is no need to actually run the production operations. In this case we can create a service stub with exactly the same functions and parameters as the real one; only instead of executing the real logic, it returns fixed responses.

function sendMail(email, content) {
    console.log('Email sent to: ' + email);
    return true;
}

When we run our app in test mode, we should make it use the fake service object instead of the real one (let’s dive into the details in future articles).
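One possible way to do the switch, as a rough sketch: the code under test receives the mail service as a parameter, and a small factory picks the stub in test mode. All names here (createMailService, registerUser, the NODE_ENV value) are assumptions for illustration, not part of any library.

```javascript
// Placeholder for the real implementation; it would talk to a mail server.
function createRealMailService() {
  return {
    sendMail: function (email, content) {
      throw new Error('Real mail sending is not part of this sketch');
    },
  };
}

// Stub: same signature, fixed response, and it records calls for assertions.
function createMailServiceStub() {
  const sent = [];
  return {
    sent: sent,
    sendMail: function (email, content) {
      sent.push({ email: email, content: content });
      return true;
    },
  };
}

// Pick the implementation based on the environment (assumed variable name).
function createMailService(env) {
  return env.NODE_ENV === 'test' ? createMailServiceStub() : createRealMailService();
}

// The code under test gets the service passed in instead of creating it.
function registerUser(email, mailService) {
  return mailService.sendMail(email, 'Welcome!');
}

const mailService = createMailService({ NODE_ENV: 'test' });
console.log(registerUser('user@example.com', mailService)); // true
console.log(mailService.sent.length); // 1
```

Because the stub records its calls, a test can also assert what would have been sent and to whom.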

 

Using the database

Let’s say we are writing a service which heavily uses a database and we need integration tests to check it. Clearly, we can substitute the database layer with a stub and let the select, insert, etc. operations return predefined fixed values. However, in most cases this is not practical and doesn’t really test the relations among the various processes. For instance, I would like a user to register, activate their account and log in. This flow uses several tables, and I would prefer to execute it on a real database.

There are several solutions here, too. I prefer to have a separate empty database: neither in-memory nor a lighter alternative, but exactly the same version of the database, just dedicated to testing. When my app runs in test mode, it fetches the test database path from the corresponding configuration and uses it for test operations. First it clears the tables to avoid a broken state.

I will use Node and Mocha for this example.

In my previous post I described the configuration of various environments. I don’t think of Mocha tests as a separate environment, because we might have dev, test and even build servers, and the tests would run on all of them. However, I will follow a similar method: I’ll use environment variables to configure the testing runtime, too, and I’ll create a .env.mocha file.

I’d like to note that the dotenv documentation clearly states that it’s not recommended to have multiple env files like .env, .env.test, .env.prod, etc.; instead we should have one .env file with different content on different servers. In my opinion, .env.mocha serves a completely different purpose and is not covered by this rule.

The next step is to use the .env.mocha file instead of the real one while the app runs in test mode. I could not find working cross-platform code for this on the internet, and I like using Windows, so I’m offering my own solution, which also avoids loading the configuration in every test file:

  • Create .env.mocha file in the project directory and configure properly with test values.
  • Create setup.js file under test directory and put this line into it:
    require('dotenv').config({path:__dirname + '/../.env.mocha'});
  • Create one more file under test directory – mocha.opts and put this line there:
    --require test/setup.js

That’s it.
When you run ‘npm test’ in the project, the .env.mocha configuration will be used in every test automatically.

As a safety measure, to make sure that I’m not loading the production configuration (and not dropping all databases), I’ll add one more property to the .env.mocha file, and the execution of setup.js will continue only if it is found (e.g. MOCHA_CONFIG_LOADED=yes).
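The guard could look roughly like this. It is a sketch: the function name assertMochaConfigLoaded is made up, and in the real setup.js it would be called with process.env right after the dotenv line.

```javascript
// Throws unless the marker property from .env.mocha is present.
function assertMochaConfigLoaded(env) {
  if (env.MOCHA_CONFIG_LOADED !== 'yes') {
    throw new Error('.env.mocha was not loaded, refusing to run the tests');
  }
}

// In test/setup.js this would be: assertMochaConfigLoaded(process.env);
// Here it is demonstrated with plain objects instead.
try {
  assertMochaConfigLoaded({});
} catch (err) {
  console.log(err.message);
}
assertMochaConfigLoaded({ MOCHA_CONFIG_LOADED: 'yes' }); // passes silently
```

Throwing from the required setup file stops the test run before any test, and any table clearing, is executed.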

I would also like to have empty tables before running the tests. Mocha has various hooks, among them before(), which is invoked before the tests of a suite if we put it inside ‘describe’. If we declare it globally, it is executed only once, before all tests. That’s exactly what I need. It would be better if I could put this code in setup.js, but if you try, you’ll find that Mocha is not yet loaded at that stage and the ‘before’ variable is not defined. So I added a hooks.js file under the test directory and described my global hooks there.
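The table clearing behind such a global hook can be sketched as follows. The db object is a hypothetical client with an async query(sql) method, and the table names are invented; the actual hook registration is shown only as a comment, because before() exists only when Mocha runs the file.

```javascript
// Hypothetical table names, for illustration only.
const TABLES_TO_CLEAR = ['users', 'sessions'];

// Deletes all rows from the given tables, one after another.
async function clearTables(db, tables) {
  for (const table of tables) {
    await db.query('DELETE FROM ' + table);
  }
}

// In test/hooks.js the root-level hook would then be something like:
//   before(function () { return clearTables(db, TABLES_TO_CLEAR); });

// Demonstration with a fake client that only records the statements.
const fakeDb = {
  executed: [],
  query: async function (sql) {
    this.executed.push(sql);
  },
};
clearTables(fakeDb, TABLES_TO_CLEAR).then(function () {
  console.log(fakeDb.executed);
});
```

Returning the promise from before() lets Mocha wait for the cleanup to finish. With foreign keys, the order of the tables would matter.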

If the integration tests take too long to execute, it’s possible to configure scripts in package.json with different commands for running unit and integration tests (separated at the directory level).

Single codebase – multiple platforms: NativeScript and React Native

Lately we needed to rewrite a project using a framework that would let us build a common codebase for several platforms: web, Android and iOS. Currently we write nearly the same logic for all three, which makes it more expensive and time-consuming.

So, I had three days for research. As a result, we will continue writing native apps; however, I’d like to share this small research on my blog. If you have used either of them (NativeScript, React Native) in a public project, I’d gladly look at it and listen to recommendations.

Let’s dive in: hybrids were ruled out from the beginning. A hybrid is a mobile app which is written in HTML/CSS/JavaScript and loaded inside the app by a web view component. Unlike an ordinary website, it can access the API of the mobile OS, but it is rendered just like a site. A native application is far smoother and more performant and the user experience is different, so hybrids are generally used for small apps or prototypes.

There are lots of frameworks for creating a hybrid: Cordova (formerly PhoneGap), Ionic, Sencha Touch, Mobile Angular UI, etc.

I could find relatively few frameworks which do not render a site, but map JavaScript code to native elements and create, in fact, native applications. For instance:

 

NativeScript

Telerik, which you may know from their hybrid library, created NativeScript – a framework with which one can build a native mobile app using Angular, TypeScript and JavaScript.

A virtual machine is started on the phone, and using reflection, C++ and the OS API it creates a bridge to the native components. More details can be found on this page: How NativeScript Works
However, despite this bridge, you won’t notice a performance lag in comparison with a real native app.

Main pros of the framework:
– Native UI
– Extensible (Easily reuse existing plugins from npm, CocoaPods (iOS) and Gradle (Android) directly in NativeScript projects)
– Cross-Platform: Write and deploy native mobile apps for iOS, Android and (soon) Windows from a single code base
– Open source: Apache 2 license.

The learning curve was quite gentle, too. Recently they adopted the Angular framework and used the same naming and API, so if you have experience with the latter, it should be easy for you.

Unlike on a website, in NativeScript and React Native we don’t create views with HTML, but with XML, which resembles views in Android.

<StackLayout>
	<Button ios:text="foo" android:text="bar"></Button>
</StackLayout>

We can specify different logic for different platforms; the button code above is one example. In JS code, we can write if (page.ios) { … }

Part of the design (colors, font size, flex layout) is done with CSS. Clearly, not all CSS rules are available, only the ones which can be mapped to native components.

What attracted me was the possibility of sharing the codebase with a web app. However, this turned out to be not so simple; a large part still remained separate. Some parts of the JavaScript code can be shared if you export them as an npm module. (There is also another way, where you unite everything in a single project like this, but in my opinion it added too many if statements and too much complexity, so I didn’t really like it at first glance.)

As far as the mobile platforms are concerned, almost all code can be shared between them.

To experiment, I picked one relatively complex page and started to build it with this framework. Some things were pretty easy to do, but when I needed a slightly more demanding UI, I got stuck. One example was the Android bottom tab bar, which is currently impossible to use with NativeScript. Actually, I was a bit skeptical from the beginning, when I saw their showcase: the apps seemed too simple. But this is still a new framework. It will probably be much better in a couple of years.

 

React Native

This one had a far more impressive gallery. It has been used in thousands of apps. The React Native framework comes from Facebook and looks like React. When I tried it out, I definitely agreed with the articles stating that, unlike in NativeScript, here you have far more control over the UI, so you are free in your designs, but more control requires more code.

React Native also tries to have a shared codebase, but it still respects the differences among the platforms. As they say, in large apps you won’t be able to share 100% of the code between iOS and Android. Sure, you can use third-party plugins and make the apps look alike, but if you ‘respect’ the features of the different platforms, you will have to write some code separately.
A simple example would be a date picker, as its look and UX are radically different on iOS and Android.

React Native tries not to strip off these specific traits of the platforms.
Some people complain about the licence, as it’s published under BSD; however, I could not see many constraints.

 

Conclusion

If I were starting a new project with a complex UI and not rewriting one, I would probably go with React Native. However, it’s not like you will easily start coding the next day. Even for those few pages I still had to go through lots of trial and error, because of version compatibility issues and because the framework was new to me. Android skills go hand in hand with both of these frameworks, as the process of building a UI looks similar.

There is one more, Xamarin, and you need C# for it. It comes from the Windows world, so at this time it’s not for us. By the way, Xamarin works a bit differently. E.g. an iOS project is compiled directly into assembly, while for Android it uses the Mono .NET framework. More details can be found here: Understanding the Xamarin mobile platform

I could not find other similar frameworks. Do you know any?

JavaScript Tricks

Photo from jstips


Some time ago I needed to dive into a minified JavaScript library, and in the process I discovered some of the tricks minifiers use to reduce code size. Clearly, the most effective one is to use 1-2 characters as a variable name. Apart from that, there are some more which no programmer would write in their right mind, but just for fun 😀 I offer a few of them for you to guess:
(And then some nice ones, too.)

1. How would you write false with 2 symbols? And true?

!1 is false, !0 is true

2. How would you reduce this code?

if (a) b();

What about these?

if (!a) b();

if (a) {
    b = 1;
    c = 2;
}

In JavaScript, && and || are short-circuit operators, as in other languages. So, once the result of the expression is clear, the rest of the expression, including function invocations, is not evaluated. Instead of the first if, we can write a&&b().
In the second case, we could have a||b().
The third case is nothing special. Sometimes minifiers just combine everything into one expression, and then no block is necessary.
if (a) b=1, c=2
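A small runnable check of these three rewrites; the counter and the variable names are only there to make the effect visible.

```javascript
let calls = 0;
function b() { calls += 1; }

let a = true;
a && b();   // like: if (a) b();   -> b is called
a || b();   // like: if (!a) b();  -> b is skipped

a = false;
a && b();   // b is skipped
a || b();   // b is called
console.log(calls); // 2

// Comma operator: several assignments in one expression, no block needed.
let x = 0, y = 0;
a = true;
a && (x = 1, y = 2);
console.log(x, y); // 1 2
```

The parentheses in the last statement matter: without them the comma would split the statement, and y = 2 would run regardless of a.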

3. What is the shortest way to check the following: if a variable is undefined, 0, false, "", null or NaN, return false; otherwise return true?
(This was not in a minifier, I just often use it).

By converting to bool: result = !!a
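A quick check of the values listed above, plus a few that may look empty but are not; toBool is just a wrapper name for the demonstration.

```javascript
function toBool(a) {
  return !!a;
}

const falsyValues = [undefined, 0, false, '', null, NaN];
const truthyValues = [1, 'text', '0', [], {}];

console.log(falsyValues.map(toBool));  // [ false, false, false, false, false, false ]
console.log(truthyValues.map(toBool)); // [ true, true, true, true, true ]
```

Note that the string '0', an empty array and an empty object all convert to true.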

4. How would you shorten this? Let’s say we are validating an arg and, if it’s not defined, we set a default value. This one is not from a minifier either.

var options = (arg) ? arg : [];

In JavaScript, && and || act a bit differently: they don’t return a boolean, but the operand which determines the result of the expression. E.g. 1 && 0 && false will return 0.

Consequently, we can write the given expression nicely like this:

var options = arg || [];
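A short sketch of this behaviour, including one caveat worth knowing: every falsy value triggers the default, not only undefined. The function names are invented for the example.

```javascript
function withDefault(arg) {
  return arg || [];
}

console.log(withDefault(undefined)); // []
console.log(withDefault([1, 2]));    // [ 1, 2 ]
console.log(1 && 0 && false);        // 0

// Caveat: a legitimate 0 (or '') is replaced by the default as well.
function pageSize(size) {
  return size || 10;
}
console.log(pageSize(25)); // 25
console.log(pageSize(0));  // 10
```

So the trick is safe for arrays and objects, but needs a second look when 0 or an empty string is a valid argument.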

5. A minifier does not do this, but let’s say we need to round a number towards zero. What is the shortest and fastest way to do it?

Let me spell out the logic. We have the function Math.floor(), but it rounds the number towards negative infinity. So 12.6 is rounded to 12, but -12.6 is rounded to -13. That is not what our question asks for (generally, I need rounding towards zero more often than the other kind).
So we get:

foo > 0 ? Math.floor(foo) : Math.ceil(foo)

If we are not sure that foo is a number and don’t want to receive NaN, then we’ll need to add another check:

typeof foo === 'number' && isFinite(foo)
? foo > 0 ? Math.floor(foo) : Math.ceil(foo) : 0;

So the question is, how would you write this with only 2 symbols 🙂 (apart from foo)

We will need bitwise operations here. We have two options: a double bitwise NOT or a bitwise OR with zero. Both convert the value to a 32-bit integer, so they only work for numbers within that range.
~~foo
foo|0
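To compare the short forms with an explicit version along the lines of the one above, here is a small sketch (truncExplicit is my name for it). The last line shows the 32-bit limitation of the bitwise tricks.

```javascript
// Explicit rounding towards zero; anything that is not a finite number gives 0.
function truncExplicit(foo) {
  return typeof foo === 'number' && isFinite(foo)
    ? (foo > 0 ? Math.floor(foo) : Math.ceil(foo))
    : 0;
}

const inputs = [12.6, -12.6, 0, NaN, Infinity];
for (const foo of inputs) {
  console.log(foo, truncExplicit(foo), ~~foo, foo | 0);
}

// Bitwise operators work on 32-bit integers, so large values wrap around.
console.log(~~2147483648.5); // -2147483648
```

For values that fit into 32 bits the three variants agree, apart from details such as numeric strings, which the bitwise forms convert to numbers first.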

If you’d like to see more details about bitwise operations, this article covers some of them.

6. How would you shorten numbers with lots of zeros, such as 1000000 or 0.0000023?

1e6, 23e-7 – Counting zeroes is bad 😀

Programming a CNC machine with JavaScript

Two days ago I received my Master’s degree from Tbilisi State University. My research in the last semester was about CNC machines and their programming language. The topic is not very close to my specialty, but it was still interesting, especially the experiments with a new programming language for this machine, which is described in the paper.

As far as I know, there are several companies in Georgia which use these machines, but I did not receive feedback from them.

Have you carried out experiments or built such a machine at home? My paper is published at this link (but it is in Georgian) 🙂


Making a mold for robot parts. The photo is taken from lcamtuf.


Othermill by otherfab