Friday, January 4, 2008

Perl FAQ's - 6

How do I find which modules are installed on my system?

You can use the ExtUtils::Installed module to show all installed distributions, although it can take awhile to do its magic. The standard library which comes with Perl just shows up as "Perl" (although you can get those with Module::CoreList).

use ExtUtils::Installed;

my $inst = ExtUtils::Installed->new();
my @modules = $inst->modules();

If you want a list of all of the Perl module filenames, you can use File::Find::Rule.

use File::Find::Rule;

my @files = File::Find::Rule->file()->name( '*.pm' )->in( @INC );

If you do not have that module, you can do the same thing with File::Find which is part of the standard library.

use File::Find;
my @files;

find(
sub {
push @files, $File::Find::name
if -f $File::Find::name && /\.pm$/
},

@INC
);

print join "\n", @files;

If you simply need to quickly check to see if a module is available, you can check for its documentation. If you can read the documentation the module is most likely installed. If you cannot read the documentation, the module might not have any (in rare cases).

prompt% perldoc Module::Name

You can also try to include the module in a one-liner to see if perl finds it.

perl -MModule::Name -e1


How can I compare two dates and find the difference?

You could just store all your dates as a number and then subtract. Life isn't always that simple though. If you want to work with formatted dates, the Date::Manip, Date::Calc, or DateTime modules can help you.

How do I remove consecutive pairs of characters?

You can use the substitution operator to find pairs of characters (or runs of characters) and replace them with a single instance. In this substitution, we find a character in (.). The memory parentheses store the matched character in the back-reference \1 and we use that to require that the same thing immediately follow it. We replace that part of the string with the character in $1.

s/(.)\1/$1/g;

We can also use the transliteration operator, tr///. In this example, the search list side of our tr/// contains nothing, but the c option complements that so it contains everything. The replacement list also contains nothing, so the transliteration is almost a no-op since it won't do any replacements (or more exactly, replace the character with itself). However, the s option squashes duplicated and consecutive characters in the string so a character does not show up next to itself

my $str = 'Haarlem'; # in the Netherlands
$str =~ tr///cs; # Now Harlem, like in New York


How can I access or change N characters of a string?


You can access the first characters of a string with substr(). To get the first character, for example, start at position 0 and grab the string of length 1.

$string = "Just another Perl Hacker";
$first_char = substr( $string, 0, 1 ); # 'J'

To change part of a string, you can use the optional fourth argument which is the replacement string.

substr( $string, 13, 4, "Perl 5.8.0" );

You can also use substr() as an lvalue.

substr( $string, 13, 4 ) = "Perl 5.8.0";


How do I change the Nth occurrence of something?

You have to keep track of N yourself. For example, let's say you want to change the fifth occurrence of "whoever" or "whomever" into "whosoever" or "whomsoever", case insensitively. These all assume that $_ contains the string to be altered.

$count = 0;
s{((whom?)ever)}{
++$count == 5 # is it the 5th?
? "${2}soever" # yes, swap
: $1 # renege and leave it there
}ige;

In the more general case, you can use the /g modifier in a while loop, keeping count of matches.

$WANT = 3;
$count = 0;
$_ = "One fish two fish red fish blue fish";
while (/(\w+)\s+fish\b/gi) {
if (++$count == $WANT) {
print "The third fish is a $1 one.\n";
}
}

That prints out: "The third fish is a red one." You can also use a repetition count and repeated pattern like this:

/(?:\w+\s+fish\s+){2}(\w+)\s+fish/i;


How can I count the number of occurrences of a substring within a string?

There are a number of ways, with varying efficiency. If you want a count of a certain single character (X) within a string, you can use the tr/// function like so:

$string = "ThisXlineXhasXsomeXx'sXinXit";
$count = ($string =~ tr/X//);
print "There are $count X characters in the string";

This is fine if you are just looking for a single character. However, if you are trying to count multiple character substrings within a larger string, tr/// won't work. What you can do is wrap a while() loop around a global pattern match. For example, let's count negative integers:

$string = "-9 55 48 -2 23 -76 4 14 -44";
while ($string =~ /-\d+/g) { $count++ }
print "There are $count negative numbers in the string";

Another version uses a global match in list context, then assigns the result to a scalar, producing a count of the number of matches.

$count = () = $string =~ /-\d+/g;


How do I capitalize all the words on one line?

To make the first letter of each word upper case:

$line =~ s/\b(\w)/\U$1/g;

This has the strange effect of turning "don't do it" into "Don'T Do It". Sometimes you might want this. Other times you might need a more thorough solution (Suggested by brian d foy):

$string =~ s/ (
(^\w) #at the beginning of the line
| # or
(\s\w) #preceded by whitespace
)
/\U$1/xg;

$string =~ s/([\w']+)/\u\L$1/g;

To make the whole line upper case:

$line = uc($line);

To force each word to be lower case, with the first letter upper case:

$line =~ s/(\w+)/\u\L$1/g;


What is the difference between a list and an array?

An array has a changeable length. A list does not. An array is something you can push or pop, while a list is a set of values. Some people make the distinction that a list is a value while an array is a variable. Subroutines are passed and return lists, you put things into list context, you initialize arrays with lists, and you foreach() across a list. @ variables are arrays, anonymous arrays are arrays, arrays in scalar context behave like the number of elements in them, subroutines access their arguments through the array @_, and push/pop/shift only work on arrays.

As a side note, there's no such thing as a list in scalar context. When you say

$scalar = (2, 5, 7, 9);

you're using the comma operator in scalar context, so it uses the scalar comma operator. There never was a list there at all! This causes the last value to be returned: 9.


Why does defined() return true on empty arrays and hashes?

The short story is that you should probably only use defined on scalars or functions, not on aggregates (arrays and hashes).


How do I sort a hash (optionally by value instead of key)?

To sort a hash, start with the keys. In this example, we give the list of keys to the sort function which then compares them ASCIIbetically (which might be affected by your locale settings). The output list has the keys in ASCIIbetical order. Once we have the keys, we can go through them to create a report which lists the keys in ASCIIbetical order.

my @keys = sort { $a cmp $b } keys %hash;

foreach my $key ( @keys )
{
printf "%-20s %6d\n", $key, $hash{$value};
}

We could get more fancy in the sort() block though. Instead of comparing the keys, we can compute a value with them and use that value as the comparison.

For instance, to make our report order case-insensitive, we use the \L sequence in a double-quoted string to make everything lowercase. The sort() block then compares the lowercased values to determine in which order to put the keys.

my @keys = sort { "\L$a" cmp "\L$b" } keys %hash;

Note: if the computation is expensive or the hash has many elements, you may want to look at the Schwartzian Transform to cache the computation results.

If we want to sort by the hash value instead, we use the hash key to look it up. We still get out a list of keys, but this time they are ordered by their value.

my @keys = sort { $hash{$a} <=> $hash{$b} } keys %hash;

From there we can get more complex. If the hash values are the same, we can provide a secondary sort on the hash key.

my @keys = sort {
$hash{$a} <=> $hash{$b}
or
"\L$a" cmp "\L$b"
} keys %hash;


How can I make my hash remember the order I put elements into it?

Use the Tie::IxHash from CPAN.

use Tie::IxHash;

tie my %myhash, 'Tie::IxHash';

for (my $i=0; $i<20; $i++) {
$myhash{$i} = 2*$i;
}

my @keys = keys %myhash;
# @keys = (0,1,2,3,...)


What's the difference between "delete" and "undef" with hashes?

Hashes contain pairs of scalars: the first is the key, the second is the value. The key will be coerced to a string, although the value can be any kind of scalar: string, number, or reference. If a key $key is present in %hash, exists($hash{$key}) will return true. The value for a given key can be undef, in which case $hash{$key} will be undef while exists $hash{$key} will return true. This corresponds to ($key, undef) being in the hash.

Pictures help... here's the %hash table:

keys values
+------+------+
| a | 3 |
| x | 7 |
| d | 0 |
| e | 2 |
+------+------+

And these conditions hold

$hash{'a'} is true
$hash{'d'} is false
defined $hash{'d'} is true
defined $hash{'a'} is true
exists $hash{'a'} is true (Perl 5 only)
grep ($_ eq 'a', keys %hash) is true

If you now say

undef $hash{'a'}

your table now reads:

keys values
+------+------+
| a | undef|
| x | 7 |
| d | 0 |
| e | 2 |
+------+------+

and these conditions now hold; changes in caps:

$hash{'a'} is FALSE
$hash{'d'} is false
defined $hash{'d'} is true
defined $hash{'a'} is FALSE
exists $hash{'a'} is true (Perl 5 only)
grep ($_ eq 'a', keys %hash) is true

Notice the last two: you have an undef value, but a defined key!

Now, consider this:

delete $hash{'a'}

your table now reads:

keys values
+------+------+
| x | 7 |
| d | 0 |
| e | 2 |
+------+------+

and these conditions now hold; changes in caps:

$hash{'a'} is false
$hash{'d'} is false
defined $hash{'d'} is true
defined $hash{'a'} is false
exists $hash{'a'} is FALSE (Perl 5 only)
grep ($_ eq 'a', keys %hash) is FALSE

See, the whole entry is gone!

technorati tags:,

1 comment:

tua022012 said...

Thanks for your link. It's useful for our community.
Same material can be found at: Account coordinator interview questions
I hope it's useful for you and you like it. Please continue sharing more information at this topic.
Best rgs!