Perl

Arthur Smyles

About and not About

This talk is about Perl 5,

the classic perl that most people have on their machines.

It is not about Perl 6, a redesign of the language that is still in progress.

What is Perl

  • Perl is a big language.
  • Perl is a dynamic language

Why Perl?

  • Large set of libraries
  • Writing utilities for Unix
  • Processing text files: friendly syntax to access files or pipes and process them
  • It's everywhere

BEGIN { $| = 1}

Data

undef

There is a value called undef.

It is the default value of a variable if it has not been initialized.

You can undefine a variable by calling undef EXPR.
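A minimal sketch of the life cycle of undef. It assumes the built-in defined function, which is not covered above, to test whether a variable currently holds a value; the variable names are made up for illustration.

```perl
use strict;
use warnings;

my $name;                                          # declared but never assigned: undef
my $before = defined($name) ? 'defined' : 'undef';

$name = 'Perl';
my $during = defined($name) ? 'defined' : 'undef';

undef $name;                                       # back to undef
my $after = defined($name) ? 'defined' : 'undef';

print "$before, $during, $after\n";                # prints undef, defined, undef
```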

$Scalars

Scalars are single item values.

They have various flavors:

  • Strings
  • Numbers
  • References: pointers to some other object, array, hash, or scalar. More on that in a minute.

Strings

Strings can be interpolated or literal.

Interpolated strings splice the value of variables into the string.

For example:

"Hello $world"

The value of the $world variable will be put into the string.

To make a string literal, use single quotes: 'Hello $world'

(List,s)

Lists are used for the construction of arrays and hashes as well as for scalars. To assign multiple scalars you can do:

($a,$b,$c)=(1,2,3);

Lists are always flattened so

(1,2,(3,4),5)
is the same as
(1,2,3,4,5)
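A small sketch of both points, flattening and multiple assignment. The variable names are chosen for illustration only.

```perl
use strict;
use warnings;

my @flat  = (1, 2, (3, 4), 5);      # the inner parentheses vanish
my $count = @flat;                  # 5 elements, not 4

my ($x, $y, $z) = (1, 2, 3);        # one assignment, three scalars
($x, $y) = ($y, $x);                # swap without a temporary variable

print "$count elements: @flat\n";   # prints 5 elements: 1 2 3 4 5
print "x=$x y=$y z=$z\n";           # prints x=2 y=1 z=3
```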

Booleans

There is no dedicated boolean type. However, certain values are defined as false.

These are:

  • undef
  • the number 0
  • the string '0'
  • the empty string ''
  • the empty list ()

Anything not false is true, including strings such as '0.0' and '00'.
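The list of false values can be checked with a sketch like the following. The truth helper is hypothetical, written only for this example; an empty array is tested through its size, which is 0.

```perl
use strict;
use warnings;

# Hypothetical helper: reports how Perl judges a single value.
sub truth { return $_[0] ? 'true' : 'false' }

my @empty = ();
my @false = map { truth($_) } (undef, 0, '0', '', scalar(@empty));
my @true  = map { truth($_) } ('0.0', '00', ' ', 'false', -1);

print "@false\n";   # prints false false false false false
print "@true\n";    # prints true true true true true
```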

@Arrays

A list of values with ordinal positions.


@a = (1,2,3);
say $a[0]; #prints 1
						

You can also retrieve multiple values from an array by listing the ordinals

($first,$third)=@a[0,2];

%Hashes

Hashes are associative arrays. => is just a synonym for , (it also quotes a bare word on its left)

%abbr = ("ny" => "New York", "nj" => "New Jersey");

Use curly brackets to retrieve the values

print $abbr{'ny'}; #prints New York

And just like arrays, you can get multiple values of a hash by listing them

($ny,$nj)=@abbr{'ny','nj'};

\References

A pointer to some other object, array, hash, scalar, or other reference. To reference something, use \


my $hello="hello";
my @hello=("how","are","you");
my %hello=("I am" => "fine", "And you?" => "good");
my $ref1=\$hello; #reference to a scalar
my $ref2=\@hello; #reference to an array
my $ref3=\%hello; #reference to a hash
my $ref4=\$ref3; #reference to $ref3 (a reference to a reference)
						

To dereference, put the sigil of what you want back ($, @ or %) in front of the reference. Single elements use the scalar sigil ($)


say $$ref1; # prints hello
say @$ref2; # prints howareyou
say $$ref3{"I am"}; #prints fine
say $$$ref4{"And you?"}; #prints good
						

Identifiers and Sigils

We have already seen some examples of identifiers and sigils.

An identifier is a name; the sigil is the symbol in front of it.

$hello

The best way to think of an identifier is as a symbol with multiple slots.

The slots are denoted by the sigil.

There is a slot for a scalar value,

a slot for an array value,

a slot for a hash,

a slot for a subroutine etc.

Packages::

Identifiers are namespaced by packages.

package names are separated by ::

So hello is a shorthand for main::hello.

One more example of a long package.

This::Is::A::Deep::hello would be located in the (This::Is::A::Deep) package

*Typeglobs

So we have packages and identifiers.

They exist in a symbol table.

Typeglobs are your handles to the symbol table.

One of the things you can do is create aliases with them: *hello = *world.

Remember, an identifier has slots, so that statement will alias all the variables. So $hello will be $world, @hello will be @world, etc.
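A short sketch of that aliasing. It assumes package variables declared with our so that it runs under strict; the values are invented for illustration.

```perl
use strict;
use warnings;

our $world = 'earth';
our @world = (1, 2, 3);
our ($hello, @hello);       # same identifier, different slots

*hello = *world;            # alias every slot of hello to world

$hello = 'mars';            # writes through to $world
push @hello, 4;             # writes through to @world

print "$world @world\n";    # prints mars 1 2 3 4
```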

[Anonymous,Arrays]

But what if you want a list of lists? We know this won't work, because lists are flattened:

my @a=(1,2,(3,4),5);

That is where anonymous arrays come in: use [] instead of ()


my @a=(1,2,[3,4],5);
say $a[2][1]; # prints 4.
						

One more thing: \@a is a reference to @a, just as [1,2,[3,4],5] is a reference to a new anonymous array holding the same values.

{Anonymous => Hash}

Same idea as anonymous arrays: use {}

my $abbr = {"ny" => "New York", "nj" => "New Jersey"};
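Anonymous arrays and hashes can be nested to build larger structures. The sketch below is one possible layout; the %state hash and the cities listed in it are made up for illustration.

```perl
use strict;
use warnings;

my %state = (
    ny => { name => 'New York',   cities => ['Albany',  'Buffalo'] },
    nj => { name => 'New Jersey', cities => ['Trenton', 'Newark']  },
);

my $second_city = $state{ny}{cities}[1];       # walk hash -> hash -> array
push @{ $state{nj}{cities} }, 'Hoboken';       # dereference to use array functions
my $city_count  = @{ $state{nj}{cities} };     # scalar context: number of cities
my @names       = map { $state{$_}{name} } sort keys %state;

print "$second_city, $city_count\n";           # prints Buffalo, 3
print join(', ', @names), "\n";                # prints New Jersey, New York
```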

Data ginsu

map { CODE } @array

Like map in other languages, except that it takes a block instead of a function. The block sees the current element in $_.

grep { CODE } @array

grep is a general filter function. It returns the elements of the list for which the code block returns true.

pop/push

pop removes a value from the end of an array; push adds one

shift/unshift

shift removes a value from the front of an array; unshift adds one
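The six functions above in one short sketch. The array contents are arbitrary.

```perl
use strict;
use warnings;

my @n = (1 .. 5);

my @squares = map  { $_ * $_ }     @n;   # (1, 4, 9, 16, 25)
my @even    = grep { $_ % 2 == 0 } @n;   # (2, 4)

push @n, 6;                 # @n is now 1..6
my $last = pop @n;          # 6, @n is 1..5 again
unshift @n, 0;              # @n is now 0..5
my $first = shift @n;       # 0, @n is 1..5 again

print "@squares | @even | $last $first | @n\n";
# prints 1 4 9 16 25 | 2 4 | 6 0 | 1 2 3 4 5
```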

&Subroutines

Subroutines are what you're writing most of the time.

Unlike other languages, there are no argument signatures. There is a special variable @_ which contains the arguments. Using what we have learned, you can unpack it into local variables with a list assignment.

Like other languages, subroutines are data, so you can pass them around.

Perl also has closures.

By default, subroutines return the result of their last statement.


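sub create_hello {
   my $zero=$_[0];               # first argument, captured by the closure below
   return sub {                  # the anonymous subroutine is the return value
	  my ($first,$second,@rest)=@_;
	  say "Hello ",$zero,$first,$second,@rest;
   };
}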
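To see that each closure keeps its own captured variable, here is a separate self-contained sketch. make_counter is a hypothetical name used only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical example: each call returns a new subroutine with private state.
sub make_counter {
    my ($start) = @_;                 # unpack the argument list
    my $count = $start;
    return sub { return $count++ };   # closes over $count
}

my $c1 = make_counter(10);
my $c2 = make_counter(0);

my $a1 = $c1->();    # 10
my $a2 = $c1->();    # 11
my $b1 = $c2->();    # 0, $c2 has its own $count
my $a3 = $c1->();    # 12

print "$a1 $a2 $b1 $a3\n";   # prints 10 11 0 12
```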
						

FILEHANDLES

Filehandles are used for reading from and writing to files. Some common ones are STDIN, STDOUT, STDERR

You can make your own with the open function.
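A sketch of open in its three-argument form with lexical filehandles. It assumes the core File::Temp module to get a scratch file name; the handle names and file contents are made up.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file, removed automatically when the program exits.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

open(my $out, '>', $path) or die "Cannot write $path: $!";
print {$out} "first line\n", "second line\n";
close $out or die "Cannot close $path: $!";

open(my $in, '<', $path) or die "Cannot read $path: $!";
my @lines = <$in>;          # list context: every line at once
close $in;
chomp @lines;               # strip the trailing newlines

print scalar(@lines), " lines, last is '$lines[-1]'\n";
# prints 2 lines, last is 'second line'
```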

Scope and Context

Lexical Scope

Declare a lexical variable with my

You almost always want to do this.

my $var="value";

Dynamic Scope

If you want a dynamically scoped variable, declare a package variable using our

Then use local to change its value temporarily; the old value comes back when the enclosing block ends


our $hello = "Hello";
sub hello {
   say $hello;
}
{
   local $hello="world";
   hello(); #prints world
}
hello(); #prints Hello
						

Context

Perl has a concept of context. The two main contexts are scalar and list context. It is a way for the caller to tell a subroutine or operator whether it wants one value or many values.

One way this is achieved is by the variable you are assigning to (called the lvalue in the Perl docs).


my $val = @array; #it is in scalar context, returns the size of the array.
my @val = @array; #list context, copies the array.
						

If you're unsure, use the scalar function to force scalar context

scalar(@array)

A lot of operators and functions do different things depending on this context. You can play too, by using the wantarray function.
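A sketch of how a subroutine can inspect its context with wantarray. The context subroutine is a hypothetical name for this example; wantarray returns true in list context, a defined false value in scalar context, and undef in void context.

```perl
use strict;
use warnings;

# Hypothetical helper: names the context it was called in.
sub context {
    return wantarray() ? 'list' : defined(wantarray()) ? 'scalar' : 'void';
}

my @array = (10, 20, 30);

my $size    = @array;           # scalar assignment: number of elements
my ($first) = @array;           # parentheses make it a list assignment
my $forced  = scalar(@array);   # explicit scalar context

my @in_list   = context();      # called in list context
my $in_scalar = context();      # called in scalar context

print "$size $first $forced @in_list $in_scalar\n";
# prints 3 10 3 list scalar
```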

Special variable $_

There are a lot of special variables in Perl. We already saw @_

The one you will use the most will be $_.

This is a dynamically scoped variable, set or used by default by many operators and control structures
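A few places where $_ is set and read implicitly, as a sketch. The word list is arbitrary.

```perl
use strict;
use warnings;

my @words = qw(apple banana cherry);

my @upper;
for (@words) {              # no loop variable given, so each element lands in $_
    push @upper, uc;        # uc with no argument works on $_
}

my @with_an = grep { /an/ } @words;   # a bare pattern match tests $_ as well

print "@upper\n";           # prints APPLE BANANA CHERRY
print "@with_an\n";         # prints banana
```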

Operators

Comparison operators

  • For Numbers: >,<,>=,<=,==,!=
  • For Strings: gt,lt,ge,le,eq,ne
  • <=> returns -1,0,1 depending on how its arguments compare (numerical sort)
  • and cmp is the same for strings (lexicographic sort)
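Choosing the numeric or the string operator changes the answer, as this sketch with arbitrary values shows.

```perl
use strict;
use warnings;

my @values = (10, 9, 100, 1);

my @numeric = sort { $a <=> $b } @values;   # compares as numbers
my @stringy = sort { $a cmp $b } @values;   # compares as strings

my $num_equal = ('1.0' == '1')  ? 'yes' : 'no';   # numerically the same
my $str_equal = ('1.0' eq '1')  ? 'yes' : 'no';   # different strings

print "@numeric\n";                # prints 1 9 10 100
print "@stringy\n";                # prints 1 10 100 9
print "$num_equal $str_equal\n";   # prints yes no
```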

Additive Operators

Binary + returns the sum of two numbers.

Binary - returns the difference of two numbers.

Binary . concatenates two strings.

Multiplicative operators

Binary * multiplies two numbers.

Binary / divides two numbers.

Binary % is the modulo operator

Binary x is the repetition operator.

say "L" . ("OL" x 3) # prints LOLOLOL 

Assignments

You all know what this means: $a = 5

and you're also familiar with $a += 2

which is the same as $a = $a + 2

In Perl, most binary operators can be combined with the equals sign in the same way.

For example, to append to a string: $a .= "es"

/Regex/

Use /PATTERN/ to match a string.

In list context, it returns the captured submatches

In scalar context, it returns true or false depending on whether it matches

Use s/PATTERN/RESULT/ to perform substitutions

By default, these regex operators operate on $_. To operate on another value use =~

Finally, to precompile a pattern and pass it around use qr/PATTERN/.
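The operators above in one sketch. The date string and the variable names are invented for illustration; s{}{} is the same substitution operator with braces as delimiters.

```perl
use strict;
use warnings;

my $date = '2024-01-15';

# List context: the captures come back as a list.
my ($year, $month, $day) = $date =~ /^(\d+)-(\d+)-(\d+)$/;

# Scalar (boolean) context: did it match?
my $is_date = $date =~ /^\d{4}-/ ? 'yes' : 'no';

# Substitution on a copy, leaving $date untouched.
(my $slashed = $date) =~ s{-}{/}g;

# A precompiled pattern stored in a scalar and reused inside another match.
my $word  = qr/[a-z]+/;
my @words = 'hello big world' =~ /($word)/g;

print "$year $month $day $is_date $slashed @words\n";
# prints 2024 01 15 yes 2024/01/15 hello big world
```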

qw(Quote like)

Replace {} with any delimiter you like

Customary  Generic    Meaning          Interpolates
''         q{}        Literal          no
""         qq{}       Literal          yes
``         qx{}       Command          yes*
           qw{}       Word list        no
//         m{}        Pattern match    yes*
           qr{}       Pattern          yes*
           s{}{}      Substitution     yes*
           tr{}{}     Transliteration  no
           y{}{}      Transliteration  no
<<EOF                 here-doc         yes*

* unless the delimiter is ''.

Range..Operator

In list context, returns a list of items. For example 1..10 returns (1,2,3,4,5,6,7,8,9,10)

In scalar context, it returns true or false and acts as a flip-flop. It works just like awk ranges.


while (<>){
   print if (/BEGIN/../END/); #prints all lines when a line matches BEGIN until a line matches END
}
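The same two behaviours on in-memory data, as a sketch that does not need an input file. The @lines array is made up for illustration.

```perl
use strict;
use warnings;

my @digits = (1 .. 5);                 # list context: (1, 2, 3, 4, 5)

my @lines = ('intro', 'BEGIN', 'one', 'two', 'END', 'outro');
my @inside;
for (@lines) {
    # Scalar context: true from the line matching BEGIN
    # through the line matching END, false elsewhere.
    push @inside, $_ if /BEGIN/ .. /END/;
}

print "@digits\n";    # prints 1 2 3 4 5
print "@inside\n";    # prints BEGIN one two END
```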
						

and or not

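&& and || are short-circuit operators. They return the last value they evaluated: && gives the first false operand (or the last operand if none is false), || gives the first true operand (or the last operand if none is true)

and and or do the same, but with much lower precedence.

! and not are also the same, except for precedence.

// is defined-or. It returns the first value if it is defined, the second otherwise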
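The difference between || and // shows up with values that are defined but false. A sketch with a made-up %config hash; // assumes Perl 5.10 or later.

```perl
use strict;
use warnings;

my %config = (port => 0);              # 0 is defined but false

my $port_or  = $config{port} || 8080;          # 0 is false, so the default wins
my $port_dor = $config{port} // 8080;          # 0 is defined, so it is kept
my $host     = $config{host} // 'localhost';   # missing key is undef

my $both    = 'left' && 'right';       # left is true: result is the right operand
my $neither = 0 && 'right';            # left is false: result is the left operand

print "$port_or $port_dor $host $both $neither\n";
# prints 8080 0 localhost right 0
```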

File test operators

You may be familiar with these file tests in bash. You can also use them in Perl.

Here is a small sample of them:

-e File exists.
-z File has zero size (is empty).
-s File has nonzero size (returns size in bytes).
-f File is a plain file.
-d File is a directory.
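A sketch that runs these tests against a scratch directory. It assumes the core File::Temp module; the file names are generated, and the exact size reported by -s may differ by platform, so it is only checked for being nonzero.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile tempdir);

my $dir = tempdir(CLEANUP => 1);             # removed at program exit
my ($fh, $file) = tempfile(DIR => $dir);

my $starts_empty = -z $file ? 'yes' : 'no';  # nothing written yet

print {$fh} "hello\n";
close $fh or die "Cannot close $file: $!";

my $size    = -s $file;                      # size in bytes, true when nonzero
my $is_file = -f $file ? 'yes' : 'no';
my $is_dir  = -d $dir  ? 'yes' : 'no';
my $missing = -e "$dir/no-such-file" ? 'yes' : 'no';

print "$starts_empty $is_file $is_dir $missing\n";   # prints yes yes yes no
```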
						

File Read Operator

To read a line from a filehandle use <FILEHANDLE>.


my $line=<STDIN>; #reads next line from stdin
my @file=<$fh>; #reads the entire file and puts it in the @file array
						

Input from <> comes from either standard input, or from each file listed in @ARGV

@ARGV is the array of parameters passed to the script

Statements

Simple Statements

Ever do this?

if (condition){ statement; }

Instead you can write this:

statement if (condition);

These are called statement modifiers. You also have unless, while, until, for, foreach, and when

print while (<>); # implementation of unix cat program

Compound Statements


if (EXPR) BLOCK
if (EXPR) BLOCK else BLOCK
if (EXPR) BLOCK elsif (EXPR) BLOCK ...
if (EXPR) BLOCK elsif (EXPR) BLOCK ... else BLOCK
unless (EXPR) BLOCK
unless (EXPR) BLOCK else BLOCK
unless (EXPR) BLOCK elsif (EXPR) BLOCK ...
unless (EXPR) BLOCK elsif (EXPR) BLOCK ... else BLOCK
given (EXPR) BLOCK
LABEL while (EXPR) BLOCK
LABEL while (EXPR) BLOCK continue BLOCK
LABEL until (EXPR) BLOCK
LABEL until (EXPR) BLOCK continue BLOCK
LABEL for (EXPR; EXPR; EXPR) BLOCK
LABEL for VAR (LIST) BLOCK
LABEL for VAR (LIST) BLOCK continue BLOCK
LABEL foreach (EXPR; EXPR; EXPR) BLOCK
LABEL foreach VAR (LIST) BLOCK
LABEL foreach VAR (LIST) BLOCK continue BLOCK
LABEL BLOCK
LABEL BLOCK continue BLOCK
						

What are blocks?

In Perl, a block by itself is a loop that executes once. A block can have an optional continue, which is a group of statements that run before the loop condition is checked again in a while/until/for/foreach

Loop control

There are three loop control instructions.

  • next: goes to the continue block, if any, and continues iteration. In other languages, this is the same as continue
  • last: exits the loop without running the continue block. Same as break in other languages
  • redo: goes back to the beginning of the loop block. Does not check the loop condition.
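A sketch of next, last and a continue block working together. The numbers are arbitrary.

```perl
use strict;
use warnings;

my @kept;
for my $n (1 .. 10) {
    next if $n % 2;         # odd: skip to the next iteration
    last if $n > 6;         # stop the loop entirely
    push @kept, $n;
}

my ($i, $ticks, @visited) = (0, 0);
while ($i < 5) {
    next if $i == 2;        # still runs the continue block below
    push @visited, $i;
}
continue {
    $i++;                   # runs after every iteration, including skipped ones
    $ticks++;
}

print "@kept | @visited | $ticks\n";   # prints 2 4 6 | 0 1 3 4 | 5
```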

given is switch

given is used like switch in other languages. The corresponding name for case is when.


given ($var) {
   when (/^abc/) { $abc = 1 }
   when (/^def/) { $def = 1 }
   when (/^xyz/) { $xyz = 1 }
   default { $nothing = 1 }
}
						

given can work with strings or numbers. It has to be enabled with use feature 'switch', and newer perls flag it as experimental.

Exception handling

Normally, calling die will kill the script.

If you want to program more defensively, wrap the risky code in eval


 # make divide-by-zero nonfatal
eval { $answer = $a / $b; }; warn $@ if $@;
					

Error strings are stored in $@

eval also accepts a string of code and runs it, as you would expect.

eval 'print "hello";'

Recommendation: Use the Try::Tiny or TryCatch libraries.
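A sketch of the plain eval idiom without any library. checked_sqrt is a hypothetical subroutine written for this example; ending the block with 1 lets the caller tell a failure apart from a legitimate false return value.

```perl
use strict;
use warnings;

# Hypothetical subroutine that dies on bad input.
sub checked_sqrt {
    my ($n) = @_;
    die "negative input\n" if $n < 0;
    return sqrt $n;
}

my $ok = eval { checked_sqrt(16) };            # no exception: eval returns the value

my $error = '';
my $survived = eval { checked_sqrt(-1); 1 };   # dies, so eval returns undef
$error = $@ unless $survived;                  # $@ holds the message passed to die

my $zero = 0;
my $quotient  = eval { 1 / $zero };            # a built-in runtime error is trapped too
my $div_error = $@;

print "ok=$ok error=$error";                   # prints ok=4 error=negative input
print "division failed\n" if $div_error =~ /^Illegal division by zero/;
```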

Objects

The basic object system in perl is quite minimal.

There are many object-oriented frameworks, ranging from Moose, a full object system that includes its own metaobject protocol, to Object::Tiny, which is deliberately small.

They all extend and facilitate development in the basic object system presented here.

An Object is Simply a Data Structure

What this means is that, at its core, an object is a reference to a standard scalar/array/hash that has been blessed

bless is a function that associates the scalar/array/hash to a class name

Objects are not encapsulated. You can access the data directly using the deref operator (->).
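A minimal sketch of a blessed hash. The Point class and its fields are hypothetical; ref is a built-in that reports which class a reference was blessed into.

```perl
use strict;
use warnings;

package Point;

# Hypothetical constructor: bless an anonymous hash into the class.
sub new {
    my ($class, $x, $y) = @_;
    my $self = { x => $x, y => $y };
    return bless $self, $class;
}

package main;

my $p     = Point->new(3, 4);
my $class = ref $p;          # the class name the hash was blessed into
my $x     = $p->{x};         # no encapsulation: read the hash directly
$p->{y}   = 10;              # ...and write to it

print "$class $x $p->{y}\n";   # prints Point 3 10
```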

A Class is Simply a Package

A class in Perl is simply a package. That means classes are really no different from any other package. The only difference is that we associate a data structure with the package as opposed to using it simply as a namespace.

Inheritance is achieved by using the parent pragma


  package Class;

  use parent 'SuperClass';
						

If you override a method, you can call the superclass's method by using the SUPER:: pseudo-class


sub override {
	my $self=shift;
	$self->SUPER::override();
}	
						

A Method is Simply a Subroutine

That means a method is no different from a subroutine.

How does that work?

Let's look at the following:


my $obj = Class->new();
$obj->do_something();
						

The second line can be interpreted as:

call the do_something subroutine defined in package Class with the first argument $obj.

The first line can be interpreted as:

call the new subroutine defined in package Class with the first argument "Class".

In Java, methods called on the class itself, like new here, would be referred to as static methods.
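The two interpretations can be checked directly. In this sketch Animal and Dog are hypothetical classes kept in one file, which is why parent is loaded with -norequire; each method call is compared with the plain subroutine call it stands for.

```perl
use strict;
use warnings;

package Animal;

sub new {
    my ($class, %args) = @_;              # first argument: the class name
    return bless { name => $args{name} }, $class;
}

sub speak {
    my ($self) = @_;                      # first argument: the object
    return "$self->{name} makes a sound";
}

package Dog;
use parent -norequire, 'Animal';          # Animal lives in this file, nothing to load

sub speak {
    my ($self) = @_;
    return $self->SUPER::speak() . ': Woof';
}

package main;

my $dog    = Dog->new(name => 'Rex');             # new is inherited from Animal
my $method = $dog->speak();                       # method call syntax
my $plain  = Dog::speak($dog);                    # the same subroutine, called directly
my $manual = Animal::new('Dog', name => 'Rex');   # what Dog->new(...) amounts to

print "$method\n";          # prints Rex makes a sound: Woof
print ref($manual), "\n";   # prints Dog
```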

tie

What if you have something that behaves like a scalar, an array, or a hash, but is backed by your own code?

Perl provides the capability to tie a scalar, array, or hash to a class.

Then you can access it just like an ordinary variable of that type.

For example, treating a key/value database as a hash

Or a special random scalar that generates a random number every time you read it

Others will be left to your imagination...
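A sketch of a tied scalar. Instead of the random scalar mentioned above it uses a counter, so the result is predictable; the Counter class is hypothetical, while TIESCALAR, FETCH and STORE are the method names Perl calls for a tied scalar.

```perl
use strict;
use warnings;

package Counter;

sub TIESCALAR {                 # called by tie; extra arguments are passed along
    my ($class, $start) = @_;
    my $value = $start;
    return bless \$value, $class;
}

sub FETCH {                     # called every time the variable is read
    my ($self) = @_;
    return $$self++;
}

sub STORE {                     # called every time the variable is assigned
    my ($self, $new) = @_;
    $$self = $new;
}

package main;

tie my $next, 'Counter', 5;

my $first  = $next;             # 5
my $second = $next;             # 6
$next = 100;                    # goes through STORE
my $third  = $next;             # 100

print "$first $second $third\n";   # prints 5 6 100
```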

Example Utility

Example

Conclusion

Now that you are over the hump

Review the slides <smyles.com/perl>.

or dive in <perldoc.perl.org/index-language.html>.

or go for a stroll <perl-tutorial.org/>.

END { questions(); }