Perl: File Functions

Table of Contents:

  • Reading Directories
  • Reading and Writing Files
  • Binary Files
  • Getting File Statistics
  • Printing Revisited

File Functions

The following file functions are available in Perl:

  • binmode(FILE_HANDLE) This function puts FILE_HANDLE into a binary mode.
  • chdir(DIR_NAME) Causes your program to use DIR_NAME as the current directory. It will return true if the change was successful, false if not.
  • chmod(MODE, FILE_LIST) This UNIX-based function changes the permissions for a list of files. A count of the number of files whose permissions were changed is returned. There is no DOS equivalent for this function.
  • chown(UID, GID, FILE_LIST) This UNIX-based function changes the owner and group for a list of files. A count of the number of files whose ownership was changed is returned. There is no DOS equivalent for this function.
  • close(FILE_HANDLE) Closes the connection between your program and the file opened with FILE_HANDLE.
  • closedir(DIR_HANDLE) Closes the connection between your program and the directory opened with DIR_HANDLE.
  • eof(FILE_HANDLE) Returns true if the next read on FILE_HANDLE will result in hitting the end of the file or if the file is not open. If FILE_HANDLE is not specified the status of the last file read is returned. All input functions return the undefined value when the end of file is reached, so you'll almost never need to use eof().
  • fcntl(FILE_HANDLE, FUNCTION, SCALAR) Implements the fcntl() function, which lets you perform various file control operations. Its use is beyond the scope of this course.
  • fileno(FILE_HANDLE) Returns the file descriptor for the specified FILE_HANDLE.
  • flock(FILE_HANDLE, OPERATION) This function will place a lock on a file so that multiple users or programs can't simultaneously use it. The flock() function is beyond the scope of this book.
  • getc(FILE_HANDLE) Reads the next character from FILE_HANDLE. If FILE_HANDLE is not specified, a character will be read from STDIN.
  • glob(EXPRESSION) Returns a list of files that match the specification of EXPRESSION, which can contain wildcards. For instance, glob("*.pl") will return a list of all Perl program files in the current directory.
  • ioctl(FILE_HANDLE, FUNCTION, SCALAR) Implements the ioctl() function, which lets you perform various file control operations. Its use is beyond the scope of this book. For more in-depth discussion of this function see Que's Special Edition Using Perl for Web Programming.
  • link(OLD_FILE_NAME, NEW_FILE_NAME) This UNIX-based function creates a new file name that is linked to the old file name. It returns true for success and false for failure. There is no DOS equivalent for this function.
  • lstat(FILE_HANDLE_OR_FILE_NAME) Returns file statistics in a 13-element array. lstat() is identical to stat() except that it can also return information about symbolic links.
  • mkdir(DIR_NAME, MODE) Creates a directory named DIR_NAME. If you try to create a subdirectory, the parent must already exist. This function returns false if the directory can't be created. The special variable $! is assigned the error message.
  • open(FILE_HANDLE, EXPRESSION) Creates a link between FILE_HANDLE and a file specified by EXPRESSION.
  • opendir(DIR_HANDLE, DIR_NAME) Creates a link between DIR_HANDLE and the directory specified by DIR_NAME. opendir() returns true if successful, false otherwise.
  • pipe(READ_HANDLE, WRITE_HANDLE) Opens a pair of connected pipes like the corresponding system call. Its use is beyond the scope of this book. For more on this function see Que's Special Edition Using Perl for Web Programming.
  • print FILE_HANDLE (LIST) Sends a list of strings to FILE_HANDLE. If FILE_HANDLE is not specified, then STDOUT is used.
  • printf FILE_HANDLE (FORMAT, LIST) Sends a list of strings in a format specified by FORMAT to FILE_HANDLE. If FILE_HANDLE is not specified, then STDOUT is used.
  • read(FILE_HANDLE, BUFFER, LENGTH, OFFSET) Reads LENGTH bytes from FILE_HANDLE into the scalar variable called BUFFER, storing them at position OFFSET within BUFFER. It returns the number of bytes read or the undefined value.
  • readdir(DIR_HANDLE) Returns the next directory entry from DIR_HANDLE when used in a scalar context. If used in an array context, all of the file entries in DIR_HANDLE will be returned in a list. If there are no more entries to return, the undefined value or a null list will be returned depending on the context.
  • readlink(EXPRESSION) This UNIX-based function returns the value of a symbolic link. If an error occurs, the undefined value is returned and the special variable $! is assigned the error message. The $_ special variable is used if EXPRESSION is not specified.
  • rename(OLD_FILE_NAME, NEW_FILE_NAME) Changes the name of a file. You can use this function to change the directory where a file resides, but not the disk drive or volume.
  • rewinddir(DIR_HANDLE) Resets DIR_HANDLE so that the next readdir() starts at the beginning of the directory.
  • rmdir(DIR_NAME) Deletes an empty directory. If the directory can't be deleted it returns false and $! is assigned the error message. The $_ special variable is used if DIR_NAME is not specified.
  • seek(FILE_HANDLE, POSITION, WHENCE) Moves to POSITION in the file connected to FILE_HANDLE. The WHENCE parameter determines if POSITION is an offset from the beginning of the file (WHENCE=0), the current position in the file (WHENCE=1), or the end of the file (WHENCE=2).
  • seekdir(DIR_HANDLE, POSITION) Sets the current position for readdir(). POSITION must be a value returned by the telldir() function.
  • select(FILE_HANDLE) Sets the default FILE_HANDLE for the write() and print() functions. It returns the currently selected file handle so that you may restore it if needed.
  • sprintf(FORMAT, LIST) Returns a string whose format is specified by FORMAT.
  • stat(FILE_HANDLE_OR_FILE_NAME) Returns file statistics in a 13-element array.
  • symlink(OLD_FILE_NAME, NEW_FILE_NAME) This UNIX-based function creates a new file name symbolically linked to the old file name. It returns false if the NEW_FILE_NAME cannot be created.
  • sysread(FILE_HANDLE, BUFFER, LENGTH, OFFSET) Reads LENGTH bytes from FILE_HANDLE into the scalar variable called BUFFER, storing them at position OFFSET within BUFFER. It returns the number of bytes read or the undefined value.
  • syswrite(FILE_HANDLE, BUFFER, LENGTH, OFFSET) Writes LENGTH bytes to FILE_HANDLE, taking them from the scalar variable called BUFFER starting at position OFFSET within BUFFER. It returns the number of bytes written or the undefined value.
  • tell(FILE_HANDLE) Returns the current file position for FILE_HANDLE. If FILE_HANDLE is not specified, the file position for the last file read is returned.
  • telldir(DIR_HANDLE) Returns the current position for DIR_HANDLE. The return value may be passed to seekdir() to access a particular location in a directory.
  • truncate(FILE_HANDLE, LENGTH) Truncates the file opened on FILE_HANDLE to be LENGTH bytes long.
  • unlink(FILE_LIST) Deletes a list of files. If FILE_LIST is not specified, then $_ will be used. It returns the number of files successfully deleted. Therefore, it returns false or 0 if no files were deleted.
  • utime(ACCESS_TIME, MODIFICATION_TIME, FILE_LIST) This UNIX-based function changes the access and modification times on each file in FILE_LIST.
  • write(FILE_HANDLE) Writes a formatted record to FILE_HANDLE.
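
To make a few of these descriptions concrete, here is a minimal sketch that chains mkdir(), open(), rename(), unlink() and rmdir() and checks each return value as described in the list above. The scratch directory and file names are hypothetical and were chosen only for this illustration.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical scratch names, used only for this illustration.
my $dir      = File::Spec->catdir(File::Spec->tmpdir(), "perl_demo_$$");
my $old_name = File::Spec->catfile($dir, "old.txt");
my $new_name = File::Spec->catfile($dir, "new.txt");

# mkdir() returns false on failure and sets $!.
mkdir($dir, 0755) || die "cannot create $dir: $!";

# Create a small file inside the new directory.
open(my $out, ">", $old_name) || die "cannot open $old_name: $!";
print $out "hello\n";
close($out);

# rename() changes the file name within the same directory.
rename($old_name, $new_name) || die "cannot rename $old_name: $!";

# unlink() returns the number of files it deleted.
my $deleted = unlink($new_name);
print "deleted $deleted file(s)\n";

# rmdir() only succeeds on an empty directory.
rmdir($dir) || die "cannot remove $dir: $!";
print "directory removed\n" unless -d $dir;
```

Checking every return value and reporting $! is what makes failures such as a missing parent directory or a non-empty directory visible.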

Reading Directories

Perl has several functions that operate on directories. The opendir(), readdir() and closedir() functions are a common way to read the contents of one.

opendir(DIR_HANDLE,"directory") opens the given directory for reading and associates it with a directory handle, which is just an identifier (no $).

Note that the directory can be given either as a full path or as a path relative to the current directory.

BE WARNED: the path separator depends on the platform. Macintosh directory paths are denoted by : whereas UNIX directory paths are denoted by /.

readdir(DIR_HANDLE) returns a scalar (string) holding the base name of the next entry, with no directory components (no : or /).

closedir(DIR_HANDLE) simply closes the directory.

Therefore to list all files in a given directory we can do the following, readdir.pl:

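# Open the Internet folder on the Maclab disk (Macintosh path).
opendir(DIR,"Maclab:Internet")
    || die "NO SUCH Directory: Internet";

# Print the name of each entry in turn.
while ($file = readdir(DIR) )
{
    print " $file\n";
}
closedir(DIR);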

The above reads a folder Internet on the top level of the Maclab hard disk.

On UNIX we may do:

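# Open the Internet sub-directory of the current directory.
opendir(DIR,"./Internet")
    || die "NO SUCH Directory: Internet";

# Print the name of each entry in turn.
while ($file = readdir(DIR) )
{
    print " $file\n";
}
closedir(DIR);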

The above reads a sub-directory Internet assumed to be located in the same directory from where the Perl script has been run.

One further example to alphabetically list files is alpha.pl:

opendir(DIR,"./Internet")
    || die "NO SUCH Directory: Internet";

# In list context readdir() returns every entry, which sort() then orders.
foreach $file ( sort readdir(DIR) )
{
    print " $file\n";
}
closedir(DIR);
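
The listings above also print the special entries "." and "..", which readdir() returns along with the real file names. A minimal sketch of one way to filter them out is shown below. The helper name list_dir is hypothetical, and a lexical directory handle is used in place of the bareword DIR.

```perl
use strict;
use warnings;

# list_dir is a hypothetical helper name, not part of Perl.
# It returns the sorted entries of a directory without "." and "..".
sub list_dir {
    my ($path) = @_;
    opendir(my $dh, $path) || die "cannot open directory $path: $!";
    my @names = grep { $_ ne "." && $_ ne ".." } readdir($dh);
    closedir($dh);
    my @sorted = sort @names;
    return @sorted;
}

# List the current directory.
print " $_\n" foreach list_dir(".");
```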

Reading and Writing Files

We have just introduced the concept of a Directory Handle for referring to a Directory on disk.

We now introduce a similar concept of File Handle for referring to a File on disk from which we can read data and to which we can write data.

Similar ideas of opening and closing the files exist.

You use the open() operator to open a file (for reading):

open(FILEHANDLE,"file_on_device");

To open a file for writing you must use the ">" symbol in the open() operator:

open(FILEHANDLE,">outfile");

Opening a file for writing always starts at the beginning of the file. If the file already exists and contains data, the file is emptied when it is opened and the old data is lost.

To open a file for appending you must use the ">>" symbol in the open() operator:

open(FILEHANDLE,">>appendfile");

The close() operator closes a file handle:

close(FILEHANDLE);
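
The difference between ">" and ">>" is easiest to see by opening the same file several times. The following is a small sketch of that idea. The scratch file name is hypothetical and the file is deleted at the end.

```perl
use strict;
use warnings;

my $name = "modes_demo_$$.txt";    # hypothetical scratch file name

# First write: creates the file.
open(FILE, ">$name") || die "cannot open $name: $!";
print FILE "first\n";
close(FILE);

# Second write with ">": the old contents are discarded.
open(FILE, ">$name") || die "cannot open $name: $!";
print FILE "second\n";
close(FILE);

# Append with ">>": the new line goes after the existing data.
open(FILE, ">>$name") || die "cannot open $name: $!";
print FILE "third\n";
close(FILE);

# Read everything back to see what survived.
open(FILE, $name) || die "cannot open $name: $!";
my @lines = <FILE>;
close(FILE);
unlink($name);

print @lines;    # prints "second" and "third", each on its own line
```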

To read from a file you simply use the <FILEHANDLE> command, which reads one line at a time from FILEHANDLE. When it is used as the condition of a while loop, each line is stored in the special Perl variable $_.

For example, read.pl:

open(FILE,"myfile")
    || die "cannot open file";
while(<FILE>)    # read one line at a time into $_
{ print $_; # echo line read
}
close(FILE);

To write to a file you use the print command and simply refer to the FILEHANDLE before the output string:

print FILEHANDLE "Output String\n";

Therefore to read from one file infile and copy line by line to another outfile we could do readwrite.pl:

open(IN,"infile")
    || die "cannot open input file";
open(OUT,">outfile")    # ">" opens outfile for writing
    || die "cannot open output file";
while(<IN>)             # read one line at a time into $_
{ print OUT $_; # echo line read
}
close(IN);
close(OUT);
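
The same copy can be written with the three-argument form of open() and lexical file handles, which keeps the mode separate from the file name and lets $! explain any failure. This is only a sketch of that alternative style. The helper name copy_lines and the scratch file names are hypothetical.

```perl
use strict;
use warnings;

# copy_lines is a hypothetical helper name.
# It copies $src to $dst line by line and returns the number of lines copied.
sub copy_lines {
    my ($src, $dst) = @_;
    open(my $in,  "<", $src) || die "cannot open $src: $!";
    open(my $out, ">", $dst) || die "cannot open $dst: $!";
    my $count = 0;
    while (my $line = <$in>) {
        print $out $line;
        $count++;
    }
    close($in);
    close($out) || die "cannot close $dst: $!";
    return $count;
}

# Build a small input file so the example is self-contained.
my ($src, $dst) = ("copy_in_$$.txt", "copy_out_$$.txt");
open(my $fh, ">", $src) || die "cannot create $src: $!";
print $fh "line 1\nline 2\n";
close($fh);

my $copied = copy_lines($src, $dst);

# Read the copy back to confirm its contents, then clean up.
open(my $check, "<", $dst) || die "cannot open $dst: $!";
my @copy = <$check>;
close($check);
unlink($src, $dst);

print "copied $copied lines\n";
```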

Binary Files

When you need to work with data files, you will need to know what binary mode is. There are two major differences between binary mode and text mode:

  • In DOS and Windows, line endings are indicated by two characters: the carriage return and newline characters. When in text mode, these characters are input as a single character, the newline character. In binary mode, both characters can be read by your program. UNIX systems only use one character, the newline, to indicate line endings.
  • In DOS and Windows, the end of file character is 26. When a byte with this value is read in text mode, the file is considered ended and your program cannot read any more information from the file. UNIX considers the end-of-file character to be 4. For both operating systems, binary mode will let the end-of-file character be treated as a regular character.

Note: The examples in this section relate to the DOS operating system.

In order to demonstrate these differences, we'll use a data file called BINARY.DAT with the following contents:

01
02
03

First, we'll read the file in the default text mode.

We proceed as follows:

  • Initialize a buffer variable. Both read() and sysread() need their buffer variables to be initialized before the function call is executed.
  • Open the BINARY.DAT file for reading.
  • Read the first 20 characters of the file using the read() function.
  • Close the file.
  • Create an array out of the characters in the $buffer variable and iterate over that array using a foreach loop.
  • Print the value of the current array element in hexadecimal format.
  • Print a newline character if the current array element is a newline character.

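The Perl to do this is binary1.pl:

$buffer = "";                    # initialize the buffer
open(FILE, "<binary.dat");       # open the file for reading
read(FILE, $buffer, 20, 0);      # read up to 20 characters into $buffer
close(FILE);

foreach (split(//, $buffer)) {
    printf("%02x ", ord($_));    # print each character in hexadecimal
    print "\n" if $_ eq "\n";    # start a new line after each newline
}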

This program displays:

30 31 0a

30 32 0a

30 33 0a

This example does a couple of things that haven't been met before. The read() function is used as an alternative to the line-by-line input done with the diamond operator. It will read a specified number of bytes from the input file and assign them to a buffer variable. The fourth parameter specifies the offset within the buffer at which the bytes are stored. In this example, we stored them at the beginning of the buffer.

The split() function in the foreach loop breaks a string into pieces and places those pieces into an array. The double slashes indicate that each character in the string should be an element of the new array.

Once the array of characters has been created, the foreach loop iterates over the array. The printf() statement converts the ordinal value of the character into hexadecimal before displaying it. The ordinal value of a character is the value of the ASCII representation of the character. For example, the ordinal value of '0' is 0x30 or 48.

The next line, the print statement, forces the output onto a new line if the current character is a newline character. This was done simply to make the output display look a little like the input file.

Now, let's read the file in binary mode and see how the output is changed.

The new code is as follows, binary2.pl:

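$buffer = "";                    # initialize the buffer
open(FILE, "<binary.dat");       # open the file for reading
binmode(FILE);                   # switch the file handle to binary mode
read(FILE, $buffer, 20, 0);      # read up to 20 characters into $buffer
close(FILE);

foreach (split(//, $buffer)) {
    printf("%02x ", ord($_));    # print each character in hexadecimal
    print "\n" if $_ eq "\n";    # start a new line after each newline
}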

This program displays:

30 31 0d 0a

30 32 0d 0a

30 33 0d 0a

When the file is read in binary mode, you can see that there are really two characters at the end of every line: the carriage return (0d) and newline (0a) characters.
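
On a UNIX system the two programs above print the same bytes, because text mode performs no translation there. To see the carriage returns on any platform, the following sketch writes DOS-style line endings explicitly and reads them back, with binmode() applied to both handles. The scratch file name is hypothetical.

```perl
use strict;
use warnings;

my $name = "binary_demo_$$.dat";    # hypothetical scratch file name

# Write three lines with explicit carriage return + newline endings.
open(my $out, ">", $name) || die "cannot open $name: $!";
binmode($out);
print $out "01\r\n02\r\n03\r\n";
close($out);

# Read the raw bytes back in binary mode.
open(my $in, "<", $name) || die "cannot open $name: $!";
binmode($in);
my $buffer = "";
my $count  = read($in, $buffer, 20);    # returns the number of bytes read
close($in);
unlink($name);

# Convert every character to two hexadecimal digits.
my $hex = join(" ", map { sprintf("%02x", ord($_)) } split(//, $buffer));
print "$count bytes: $hex\n";
# prints: 12 bytes: 30 31 0d 0a 30 32 0d 0a 30 33 0d 0a
```

Although 20 bytes were requested, read() returns 12 because that is all the file contains.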

Getting File Statistics

The file test operators can tell you a lot about a file, but sometimes you need more. In those cases, you use the stat() or lstat() function. The stat() function returns file information in a 13-element array. You can pass either a file handle or a file name as the parameter. If the file can't be found or another error occurs, the null list is returned. The listing below shows how to use the stat() function to find out information about the EOF.DAT file used earlier in the chapter.

The Perl code stat.pl is:

($dev, $ino, $mode, $nlink, $uid, $gid, $rdev, $size,
 $atime, $mtime, $ctime, $blksize, $blocks) = stat("eof.dat");

print("dev = $dev\n");
print("ino = $ino\n");
print("mode = $mode\n");
print("nlink = $nlink\n");
print("uid = $uid\n");
print("gid = $gid\n");
print("rdev = $rdev\n");
print("size = $size\n");
print("atime = $atime\n");
print("mtime = $mtime\n");
print("ctime = $ctime\n");
print("blksize = $blksize\n");
print("blocks = $blocks\n");

In the DOS environment, this program displays:

dev = 2
ino = 0
mode = 33206
nlink = 1
uid = 0
gid = 0
rdev = 2
size = 13
atime = 833137200
mtime = 833195316
ctime = 833194411
blksize =
blocks =

Some of this information is specific to the UNIX environment and is not displayed here. One interesting piece of information is the $mtime value, which is the date and time of the last modification made to the file. You can interpret this value by using the following line of code:

($sec, $min, $hr, $day, $month, $year, $day_Of_Week, $julianDate, $dst) = localtime($mtime);

If you are only interested in the modification date, you can use the array slice notation to just grab that value from the 13-element array returned by stat().

For example:

$mtime = (stat("eof.dat"))[9];

Notice that the stat() function is surrounded by parentheses so that the return value is evaluated in an array context. Then the tenth element is assigned to $mtime. You can use this technique whenever a function returns a list.
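
A slice can pick out more than one element at a time, and the values returned by localtime() need a small adjustment before they are printed. The sketch below combines the two ideas. The scratch file name and the timestamp layout are assumptions made for this illustration.

```perl
use strict;
use warnings;

my $name = "stat_demo_$$.dat";    # hypothetical scratch file name
open(my $fh, ">", $name) || die "cannot open $name: $!";
print $fh "0123456789ABC";        # 13 bytes
close($fh);

# Index 7 of the list returned by stat() is the size, index 9 is the mtime.
my ($size, $mtime) = (stat($name))[7, 9];
unlink($name);

# localtime() returns the month as 0..11 and the year as years since 1900.
my ($sec, $min, $hr, $day, $month, $year) = localtime($mtime);
my $stamp = sprintf("%04d-%02d-%02d %02d:%02d:%02d",
                    $year + 1900, $month + 1, $day, $hr, $min, $sec);

print "size = $size, modified $stamp\n";
```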

Printing Revisited

We've been using the print() function throughout this book without really looking at how it works. Let's remedy that now.

The print() function is used to send output to a file handle. Most of the time, we've been using STDOUT as the file handle. Because STDOUT is the default, we did not need to specify it. The syntax for the print() function is: print FILE_HANDLE (LIST)

You can see from the syntax that print() is a list operator because it's looking for a list of values to print. If you don't specify a list, then $_ will be used. You can change the default file handle by using the select() function. Let's take a look at this:

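open(OUTPUT_FILE, ">testfile.dat");

$oldHandle = select(OUTPUT_FILE);    # make OUTPUT_FILE the default handle
print("This is line 1.\n");          # goes to testfile.dat

select($oldHandle);                  # restore the previous default handle
print("This is line 2.\n");          # goes to STDOUT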

This program displays:

This is line 2.

and creates the TESTFILE.DAT file with a single line in it:

This is line 1.
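
The selected handle applies to printf() as well as print(), so anything printed between the two select() calls ends up in the file. The following sketch shows this and reads the file back to confirm what was written. The scratch file name is hypothetical.

```perl
use strict;
use warnings;

my $name = "select_demo_$$.dat";    # hypothetical scratch file name
open(my $out, ">", $name) || die "cannot open $name: $!";

my $old = select($out);             # $out becomes the default handle
print "This is line 1.\n";          # written to the file
printf("%d + %d = %d\n", 1, 2, 3);  # printf() follows the selected handle too
select($old);                       # restore the previous default handle
close($out);

print "This is line 2.\n";          # written to STDOUT

# Read the file back to see what was written to it.
open(my $in, "<", $name) || die "cannot open $name: $!";
my @saved = <$in>;
close($in);
unlink($name);

print "file contains: $_" foreach @saved;
```

Saving the value returned by select() and passing it back afterwards is what keeps later print() calls from silently going to the file.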

Perl also has the printf() function which lets you be more precise in how things are printed out. The syntax for printf() looks like this:

printf FILE_HANDLE (FORMAT_STRING, LIST)

Like print(), the default file handle is STDOUT. The FORMAT_STRING parameter controls what is printed and how it looks. For simple cases, the formatting parameter looks identical to the list that is passed to printf(). For example:

$januaryCost = 123.34;
$februaryCost = 23345.45;

printf("January  = \$$januaryCost\n");
printf("February = \$$februaryCost\n");

This program displays:

January  = $123.34
February = $23345.45

In this example, only one parameter is passed to the printf() function: the formatting string. Because the formatting string is enclosed in double quotes, variable interpolation will take place just like for the print() function.

This display is not good enough for a report because the decimal points of the numbers do not line up. You can use the formatting specifiers shown below:

Specifier  Description
c          Indicates that a single character should be printed.
s          Indicates that a string should be printed.
d          Indicates that a decimal number should be printed.
u          Indicates that an unsigned decimal number should be printed.
x          Indicates that a hexadecimal number should be printed.
o          Indicates that an octal number should be printed.
e          Indicates that a floating point number should be printed in scientific notation.
f          Indicates that a floating point number should be printed.
g          Indicates that a floating point number should be printed using the most space-saving format, either e or f.

The formats can be modified as follows:

Modifier  Description
-         Indicates that the value should be printed left-justified.
#         Forces octal numbers to be printed with a leading zero. Hexadecimal numbers will be printed with a leading 0x.
+         Forces signed numbers to be printed with a leading + or - sign.
0         Pads the displayed number with zeros instead of spaces.
.         Forces the value to be at least a certain width.

An example use of . is: %10.3f

which means that the value will be at least 10 positions wide. And because f is used for floating point, exactly 3 positions to the right of the decimal point will be displayed. %.10s will print a string at most 10 characters long.
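
A quick way to check what a specifier or modifier does is to pass it to sprintf() and look at the resulting string. The sketch below does that for several of the combinations described above. The square brackets are only there to make the padding visible.

```perl
use strict;
use warnings;

# Each entry formats one value; the comment shows the resulting string.
my @samples = (
    sprintf("[%5d]",    42),                  # [   42]
    sprintf("[%-5d]",   42),                  # [42   ]
    sprintf("[%05d]",   42),                  # [00042]
    sprintf("[%+d]",    42),                  # [+42]
    sprintf("[%#x]",    255),                 # [0xff]
    sprintf("[%#o]",    8),                   # [010]
    sprintf("[%10.3f]", 3.14159),             # [     3.142]
    sprintf("[%.10s]",  "abcdefghijklmnop"),  # [abcdefghij]
);

print "$_\n" foreach @samples;
```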

Returning to our above example, to print the cost variables using format specifiers, we may write print.pl:

$januaryCost = 123.34;
$februaryCost = 23345.45;

printf("January  = \$%8.2f\n", $januaryCost);
printf("February = \$%8.2f\n", $februaryCost);

This program displays:

January  = $  123.34
February = $23345.45

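This example uses the f format specifier to print a floating point number. The February value is printed right next to the dollar sign because $februaryCost takes up all 8 positions of the field width. The January value is shorter, so it is padded on the left with spaces.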

If you did not know the width of the numbers that you need to print in advance, you could use the following technique:

  • Create two variables to hold costs for January and February.
  • Find the length of the largest number.
  • Print the cost variables using variable interpolation to determine the width of the numbers to print.
  • Define the max() function.

In Perl we would do, printdemo.pl:

$januaryCost = 123.34;
$februaryCost = 23345.45;

# The length of the largest number becomes the field width.
$maxLength = length(max($januaryCost, $februaryCost));

printf("January  = \$%${maxLength}.2f\n", $januaryCost);
printf("February = \$%${maxLength}.2f\n", $februaryCost);

# Return the largest value in the argument list.
sub max {
    my($max) = shift(@_);
    foreach $temp (@_) {
        $max = $temp if $temp > $max;
    }
    return($max);
}

This program displays:

January  = $  123.34
February = $23345.45

While taking the time to find the longest number is more work, the result is worth it.
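
As an alternative to interpolating the width into the format string, printf() and sprintf() accept * as the width, in which case the width is taken from the argument list. The sketch below uses that form and measures each value after it has been formatted, since a number such as 5.5 prints as 5.50 and needs one more position than its plain length suggests. The %cost hash and the report layout are assumptions made for this illustration.

```perl
use strict;
use warnings;

my %cost = (January => 123.34, February => 23345.45);

# Find the width of the widest value as it will actually be printed.
my $width = 0;
foreach my $value (values %cost) {
    my $len = length(sprintf("%.2f", $value));
    $width = $len if $len > $width;
}

# "%*.2f" takes its field width from the argument list ($width here).
my @report;
foreach my $month ("January", "February") {
    push @report, sprintf("%-8s = \$%*.2f", $month, $width, $cost{$month});
}

print "$_\n" foreach @report;
# prints:
# January  = $  123.34
# February = $23345.45
```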
