Julia DataFrames

 

Filtering a DataFrame


New Syntax

 

Syntax explained:

  1. from the original "data" (type DataFrame)
  2. create a new "dataA" (type DataFrame)
  3. keep the rows whose "Treatment" column equals "A"
  4. include ALL COLUMNS, indicated by ":"
  5. show the first 6 rows


dataA = data[isequal.(data.Treatment, "A"), : ] 

first(dataA, 6)



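6 rows × 5 columns

 Row │ Age    WCC      CRP    Treatment  Result
     │ Int64  Float64  Int64  String     String
─────┼──────────────────────────────────────────
   1 │    37     10.2     30  A          Worse
   2 │    67     10.4     70  A          Worse
   3 │    42     10.4    100  A          Static
   4 │    56     12.5     80  A          Worse
   5 │    60      9.5      0  A          Static
   6 │    62     13.7     10  A          Worse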
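The same row selection can also be written with `filter` or `subset`. The sketch below uses a small made-up DataFrame (the values are hypothetical, only the column names follow the example above) and assumes a reasonably recent DataFrames.jl (1.0 or later for `subset`).

```julia
using DataFrames

# Hypothetical sample data; only the column names mirror the example above
data = DataFrame(
    Age = [37, 25, 67, 42],
    Treatment = ["A", "B", "A", "A"],
    Result = ["Worse", "Static", "Worse", "Static"],
)

# Boolean-mask indexing, as used in this post
dataA = data[isequal.(data.Treatment, "A"), :]

# Same rows with filter and a column => predicate pair
dataA2 = filter(:Treatment => ==("A"), data)

# Same rows with subset; ByRow applies the predicate to each element
dataA3 = subset(data, :Treatment => ByRow(==("A")))
```

All three forms return a new DataFrame and leave `data` unchanged, so which one to use is mostly a matter of taste.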

 


Deprecated Syntax

# dataA = data[data[:Treatment] .== "A", :] 

 

Warning: `getindex(df::DataFrame, col_ind::ColumnIndex)` is deprecated, use `df[!, col_ind]` instead.
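The warning points to `df[!, col]` as the replacement. As a small sketch on a made-up DataFrame (the data is hypothetical), `!` returns the stored column itself while `:` returns a copy:

```julia
using DataFrames

# Hypothetical two-row DataFrame
df = DataFrame(Treatment = ["A", "B"])

col_direct = df[!, :Treatment]   # the stored column, no copy (same as df.Treatment)
col_copy   = df[:, :Treatment]   # a freshly allocated copy

same_object = col_direct === df.Treatment   # identical object
is_copy     = col_copy !== df.Treatment     # different object, equal contents

# The deprecated filter, rewritten with the new column access
dfA = df[df[!, :Treatment] .== "A", :]
```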



Julia packages (Pkg)

Here are the Julia packages I have been using so far. I keep this list in case I need to re-install Julia.

Un-comment, install and pre-compile the package you need:


import Pkg

# Pkg.add("Atom") 

# using Atom

 

# Pkg.add("BinDeps") 

# using BinDeps

 

# Pkg.add("CSV") 

# using CSV

 

# Pkg.add("CUDAapi") 

# using CUDAapi

 

# Pkg.add("Calculus") 

# using Calculus

 

# Pkg.add("CodecZlib") 

# using CodecZlib

 

# Pkg.add("Codecs") 

# using Codecs

 

# Pkg.add("ColorTypes") 

# using ColorTypes

 

# Pkg.add("Colors") 

# using Colors

 

# Pkg.add("Combinatorics") 

# using Combinatorics

 

# Pkg.add("Compat") 

# using Compat

 

# Pkg.add("Compose") 

# using Compose

 

# Pkg.add("Contour") 

# using Contour

 

# Pkg.add("DataArrays") 

# using DataArrays

 

# Pkg.add("DataFrames") 

# using DataFrames

 

# Pkg.add("DataStructures") 

# using DataStructures

 

# Pkg.add("Distances") 

# using Distances

 

# Pkg.add("Distributions") 

# using Distributions

 

# Pkg.add("FileIO") 

# using FileIO

 

# Pkg.add("FixedPointNumbers") 

# using FixedPointNumbers

 

# Pkg.add("Flux") 

# using Flux

 

# Pkg.add("ForwardDiff") 

# using ForwardDiff

 

# Pkg.add("GLM") 

# using GLM

 

# Pkg.add("GR") 

# using GR

 

# Pkg.add("GZip") 

# using GZip

 

# Pkg.add("GeometryBasics") 

# using GeometryBasics

 

# Pkg.add("Hexagons") 

# using Hexagons

 

# Pkg.add("HypothesisTests") 

# using HypothesisTests

 

# Pkg.add("IJulia") 

# using IJulia

 

# Pkg.add("JSON") 

# using JSON

 

# Pkg.add("Juno") 

# using Juno

 

# Pkg.add("KernelDensity") 

# using KernelDensity

 

# Pkg.add("Knet") 

# using Knet

 

# Pkg.add("Lathe") 

# using Lathe

 

# Pkg.add("Loess") 

# using Loess

 

# Pkg.add("Measures") 

# using Measures

 

# Pkg.add("Metalhead") 

# using Metalhead

 

# Pkg.add("NaNMath") 

# using NaNMath

 

# Pkg.add("OffsetArrays") 

# using OffsetArrays

 

# Pkg.add("Optim") 

# using Optim

 

# Pkg.add("PDMats") 

# using PDMats

 

# Pkg.add("Parameters") 

# using Parameters

 

# Pkg.add("Plots") 

# using Plots

 

# Pkg.add("Polynomials") 

# using Polynomials

 

# Pkg.add("Primes") 

# using Primes

 

# Pkg.add("PyCall") 

# using PyCall

 

# Pkg.add("PyPlot") 

# using PyPlot

 

# Pkg.add("QuartzImageIO") 

# using QuartzImageIO

 

# Pkg.add("RDatasets") 

# using RDatasets

 

# Pkg.add("Reexport") 

# using Reexport

 

# Pkg.add("Rmath") 

# using Rmath

 

# Pkg.add("Roots") 

# using Roots

 

# Pkg.add("Showoff") 

# using Showoff

 

# Pkg.add("SortingAlgorithms") 

# using SortingAlgorithms

 

# Pkg.add("StaticArrays") 

# using StaticArrays

 

# Pkg.add("StatsBase") 

# using StatsBase

 

# Pkg.add("StatsFuns") 

# using StatsFuns

 

# Pkg.add("StatsPlots") 

# using StatsPlots

 

# Pkg.add("TensorFlow") 

# using TensorFlow

 

# Pkg.add("TextAnalysis") 

# using TextAnalysis

 

# Pkg.add("TinySegmenter") 

# using TinySegmenter

 

# Pkg.add("URIParser") 

# using URIParser

 

# Pkg.add("UnicodePlots") 

# using UnicodePlots

 

# Pkg.add("WordTokenizers") 

# using WordTokenizers

 

# Pkg.add("Dates") 

# using Dates

 

# Pkg.add("DelimitedFiles") 

# using DelimitedFiles

 

# Pkg.add("SHA") 

# using SHA

 

# Pkg.add("Statistics") 

# using Statistics
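When re-installing, several packages can be added in one call, because `Pkg.add` also accepts a vector of names. A minimal sketch; the short list `wanted` is only an illustration taken from the entries above, and the install line is left commented out in the same spirit as the list:

```julia
import Pkg

# Illustrative subset of the packages listed above
wanted = ["CSV", "DataFrames", "Plots"]

# Un-comment to install all of them in one step (needs network access)
# Pkg.add(wanted)
```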

"Rata Die". Counting Days since...

 

Counting Days since...

https://en.wikipedia.org/wiki/Rata_Die



using Dates

# col1 is an exported column of dates

lastDay = 54 # last row in the column sorted by the date

daysSince(x) = Dates.datetime2rata(x) - Dates.datetime2rata(col1[lastDay])

arrayDays = Array{Int64}(undef, lastDay) # array size is 54

for i = 1:lastDay

    arrayDays[i] = daysSince(col1[i])

end

arrayDays


Output:

54-element Array{Int64,1}:
 613
 606
 599
 592
 582
 575
 568
 554
 547
 540
 533
 526
 512
   ⋮
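The loop above can also be written as a broadcast. The sketch below uses three made-up dates (hypothetical stand-ins for the exported column, newest first) and `col1[end]` in place of the hard-coded `lastDay` index.

```julia
using Dates

# Hypothetical dates, newest first, standing in for the exported column
col1 = [Date(2020, 3, 15), Date(2020, 3, 8), Date(2020, 3, 1)]

# Days elapsed since the last (oldest) entry in the column
daysSince(x) = Dates.datetime2rata(x) - Dates.datetime2rata(col1[end])

# Broadcasting with the dot replaces the explicit for-loop
arrayDays = daysSince.(col1)

# Subtracting Date values gives Day periods; Dates.value extracts the integers
arrayDays2 = Dates.value.(col1 .- col1[end])
```

Both approaches give the same day counts; the second one avoids the Rata Die conversion entirely.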

Flux: The Julia Machine Learning Library

 

Flux is a machine learning library written in the Julia language.



Installation

(v1.3) pkg> add Flux

Julia language, read delimited files (CSV) with "readdlm"

 

When trying to execute:


wikiEVDraw = readdlm("wikipediaEVDraw.csv", ',') # getting quotes right is important!



I get the following error:

UndefVarError: readdlm not defined.


help?> readdlm
search: readdir

Couldn't find readdlm
Perhaps you meant readdir, read, read!, real or readchomp
  No documentation found.

  Binding readdlm does not exist.


In this case use:

using DelimitedFiles

wikiEVDraw = readdlm("wikipediaEVDraw.csv", ',') # getting quotes right is important!
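To try this without the original file, here is a small self-contained sketch that writes a made-up CSV to a temporary file and reads it back. The file contents are hypothetical. Note that the delimiter is a `Char`, so it takes single quotes.

```julia
using DelimitedFiles

# Write a small hypothetical CSV to a temporary file
path = tempname() * ".csv"
write(path, "1,2,3\n4,5,6\n")

# The delimiter is a Char: ',' with single quotes, not "," with double quotes
m = readdlm(path, ',')

# Optionally request an element type instead of the default parsing
mInt = readdlm(path, ',', Int)
```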

Using Julia Language with AWS Lambda functions.

 

Here is a Julia language library for working with AWS Lambda functions:

https://github.com/samoconnor/AWSLambda.jl#deploy-a-julia-lambda-function



Julia

The best way to start with Julia is to install JuliaPro, for free.



Arrays in Julia Language

Ways to create an Array in the Julia language:


A simple way of creating an array


array18 = fill(18, 4) # values of 18, 4 rows


4-element Array{Int64,1}:
 18
 18
 18
 18




Float64, 2-dimensional,  4-rows, 3-columns

A = Array{Float64,2}(undef, 4,3) 

# Array of floats, 2-dimensional, populated with undefined (uninitialized) data, 4 rows, 3 columns


4×3 Array{Float64,2}:
 2.40692e-314  5.0e-324      1.5e-323
 2.39671e-314  2.40692e-314  1.5e-323
 2.40692e-314  2.40692e-314  1.0e-323 
 2.40692e-314  2.40692e-314  2.38725e-314 


A[1,1] = 123

A




4×3 Array{Float64,2}:
 123.0           5.0e-324      2.5e-323
   2.38296e-314  2.38338e-314  2.5e-323
   2.38338e-314  2.38338e-314  1.0e-323
   2.38338e-314  2.38338e-314  2.64889e-32


Abstract Type Integer

A = Array{Integer}(undef,2,3)


2×3 Array{Integer,2}:
 #undef  #undef  #undef
 #undef  #undef  #undef


Concrete Type Int64

A = Array{Int64}(undef,2,3)

2×3 Array{Int64,2}:
 4826207760  4826207824  4826207888
 4826207792  4826207856  4825801360
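Arrays created with `undef` hold whatever happened to be in memory, which is why the values above look random. When defined starting values are needed, a sketch like the following may help (the variable names are only illustrative):

```julia
# Zero-initialized 4×3 array of Float64
Z = zeros(Float64, 4, 3)

# 2×3 array with every element set to 18
F = fill(18, 2, 3)

# Allocate uninitialized, then overwrite every element in place
A = Array{Float64,2}(undef, 4, 3)
fill!(A, 0.0)

# Int64 is a concrete type, Integer is an abstract one
int64_concrete   = isconcretetype(Int64)
integer_abstract = isabstracttype(Integer)
```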







Special Characters used for Math, ML, Logic in JupyterLab


Here is a quick cheat sheet of useful Unicode characters you can use in JupyterLab and the Julia language:


Greek Alphabet


α    \alpha
β    \beta
γ    \gamma
δ    \delta
ϵ    \epsilon
ζ    \zeta
η    \eta
θ    \theta
ι    \iota
κ    \kappa (should look like 'k')
λ    \lambda
μ    \mu
ν    nu (not supported in Jupyter)
ξ    \xi
ο    omicron (not supported in Jupyter)
π    \pi
ρ    \rho
σ    \sigma
τ    \tau
υ    upsilon (not supported in Jupyter)
ϕ    \phi
χ    \chi
ψ    \psi
ω    \omega

Hebrew


ℵ    \aleph

Machine Learning


x̂    x\hat

Math

∡    \measuredangle
∑    \sum
±    \pm (plus minus)
≠    \ne
≥    \ge (greater or equal)
≤    \le (less or equal)
⌀    \diameter
0°   0\degree

Sets


∈    \in

Logic

∴    \therefore
⊻    \xor
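Several of these characters are not just decoration: Julia accepts them as variable names and operators. A short sketch, typed with the tab completions listed above (the variable names are arbitrary):

```julia
# Greek letters as ordinary variable names
α = 0.5
θ = π / 4          # π is a built-in constant

# Operators entered via \in, \ne, \ge, \le and \xor
contains3 = 3 ∈ [1, 2, 3]
differs   = 1 ≠ 2
atLeast   = 2 ≥ 1
atMost    = 1 ≤ 2
exclusive = true ⊻ false

# ≈ (typed as \approx) compares floating-point values approximately
identityHolds = sin(θ)^2 + cos(θ)^2 ≈ 1
```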