Julia: Hello World
16 Dec 2019

Let’s begin with the biggest cliché in the programming world.
Installing Julia and an IDE
The official Julia distribution can be downloaded from Julia’s website. There are multiple ways to develop with the Julia language. First of all, we can install the IJulia package to use Julia within Jupyter Notebook. To add a package, press “]” inside the Julia REPL to enter the package-manager mode and use the “add” command.
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.3.0 (2019-11-26)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

(v1.3) pkg> add IJulia
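Once IJulia has been added, a Jupyter Notebook session can be launched directly from the REPL. A minimal sketch (IJulia’s notebook() function will offer to install Jupyter through Conda.jl if it cannot find an existing installation):

using IJulia
notebook()  # opens Jupyter Notebook in the browser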
Another way of writing Julia effectively is to use the Juno IDE, which is powered by Atom. The Atom editor can be downloaded from the official site, and the Juno IDE can be installed with Atom’s package manager.
On Ubuntu, the Atom editor can be installed with the snap utility.
sudo snap install atom --classic
In theory, Julia can also be installed with snap.
sudo snap install julia --classic
However, when I wrote this post, there was a compatibility issue: the Julia version provided by snap seemed to be too old for Juno to recognize. The Julia distribution downloaded from the official site worked fine with Juno.
Hello World
Let’s try a more interesting hello world task. The phrase “hello world” contains 8 distinct characters: “h”, “e”, “l”, “o”, “ ”, “w”, “r” and “d”. We can represent each character by its ASCII code and treat these codes as the target values of a linear regression task. That is, we manually synthesize a dataset whose regression targets are exactly these values, so that the phrase “hello world” can be generated from a set of data points.
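For reference, the distinct characters and their ASCII codes can be listed in a couple of lines of Julia (a quick illustration only, not part of the pipeline below):

chars = sort(unique(collect("hello world")))   # [' ', 'd', 'e', 'h', 'l', 'o', 'r', 'w']
codes = Int.(chars)                            # [32, 100, 101, 104, 108, 111, 114, 119]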
First, we use the following Python program to generate the dataset and serialize it with HDF5, which is a cross-language format.
import numpy as np
import h5py
np.random.seed(0x1a2b3c4f)
# create a random weight vector
w = np.random.uniform(1.0, 8.0, 10)
sign = np.random.randint(0, 2, w.size)
print(sign)
sign[sign == 0] = -1
w = w * sign
print(w)
phrase = 'hello world'
phraseLetters = [ch for ch in phrase]
uniqueLetters = sorted(list(set(phrase)))
print(uniqueLetters)
ints = [ord(ch) for ch in uniqueLetters]
print(ints)
# for each integer, generate some training samples
numSample = 50
def get_random_vector(targetNum):
    # generate a random vector
    randVec = np.random.uniform(-5.0, 5.0, w.size)
    # randomly select an index whose value will be replaced
    indSel = np.random.randint(0, w.size)
    # inner product with w, excluding the contribution of the selected index
    ip = randVec @ w - randVec[indSel] * w[indSel]
    # solve for the value at the selected index so that finalVec . w == targetNum
    diff = float(targetNum) - ip
    replaceVal = diff / w[indSel]
    finalVec = randVec.copy()
    finalVec[indSel] = replaceVal
    assert abs(finalVec.dot(w) - targetNum) < 1.0e-10
    return finalVec
trainVecs = dict()
for num in ints:
    randVecs = [get_random_vector(num) for _ in range(numSample)]
    randVecs = np.stack(randVecs, axis=0)
    # perturb the features so the targets are no longer matched exactly
    randVecs += np.random.uniform(-0.5, 0.5, randVecs.shape)
    trainVecs[num] = randVecs
train_x = []
train_y = []
for key, val in trainVecs.items():
    # the label is the ASCII code, with a little noise added
    y = np.zeros(val.shape[0], val.dtype)
    y.fill(key)
    y += np.random.uniform(-0.2, 0.2, y.shape)
    train_x.append(val)
    train_y.append(y)
train_x = np.concatenate(train_x, axis=0)
train_y = np.concatenate(train_y)
test_x = []
# generate the prediction set
for ch in phrase:
    ind = ord(ch)
    vec = get_random_vector(ind)
    test_x.append(vec)
test_x = np.stack(test_x, axis=0)
print('data generation finished')
file = h5py.File('hw_data.h5', 'w')
file['train_x'] = train_x
file['train_y'] = train_y
file['test_x'] = test_x
file.close()
Then, in Julia, we read the data and run Ridge regression.
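A minimal sketch of this step, assuming the HDF5.jl package, a closed-form ridge solution, and an arbitrarily chosen penalty λ = 0.1, could look as follows:

using HDF5, LinearAlgebra

# read the arrays written by the Python script; h5py stores them row-major,
# so HDF5.jl returns them with the dimensions reversed (features × samples)
train_x = h5read("hw_data.h5", "train_x")   # 10 × 400
train_y = h5read("hw_data.h5", "train_y")   # 400-element vector
test_x  = h5read("hw_data.h5", "test_x")    # 10 × 11

# closed-form ridge regression: beta = (X Xᵀ + λI) \ (X y)
λ = 0.1
beta = (train_x * train_x' + λ * I) \ (train_x * train_y)

# predict the ASCII codes, round them, and convert back to characters
pred = test_x' * beta
println(String(Char.(round.(Int, pred))))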
We get the “hello world” that we want!