Monday, July 28, 2008

Here Docs in PowerShell

Those of you who have done Unix-style shell scripting or Perl may be familiar with the concept of a Here Doc.  This example from Perl may help explain the name:

my $multiline_string = <<HERE;
This
  is
    my
      string.
HERE


The idea is that once you start the Here Doc, everything you type is taken as part of your string until you tell it that it's time to stop, in this case by putting HERE on its own line.

In PowerShell you can do something similar, using @' and '@ or @" and "@.  Everything from the opening @' or @" to the closing '@ or "@ is considered part of the same string.  (One gotcha: the closing '@ or "@ has to be at the very start of its own line.)

PS C:\Users\tojo2000\Documents> $multiline_string = @'
>> This
>>   is
>>     my
>>       System.String.
>> '@
>>
PS C:\Users\tojo2000\Documents> $multiline_string
This
  is
    my
      System.String.

Note that the multiline strings use single and double quotes to control variable interpolation just like regular single and double quotes.  Any variables in a multiline double-quoted string will be expanded.
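A quick sketch of the difference (the $name variable here is just for illustration):

```powershell
$name = 'Tim'

# Double-quoted here-string: variables are expanded
$expanded = @"
Hello, $name!
"@

# Single-quoted here-string: everything is taken literally
$literal = @'
Hello, $name!
'@

$expanded   # Hello, Tim!
$literal    # Hello, $name!
```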

Why would you want to do this?  There are a lot of situations in which this can make your strings a lot more readable.  As an example, let's say you have a SQL query:

$sql = 'SELECT * FROM employees INNER JOIN parking ON parking.emp_id = employees.emp_id WHERE parking.size = "Compact"';

Compare that to this:

$sql = @'
SELECT * FROM employees
INNER JOIN parking
  ON parking.emp_id = employees.emp_id
WHERE parking.size = "Compact"
'@


More Preference Variables

I covered $ErrorActionPreference in the last post, but there are a few other preference variables that you might want to be aware of.  The full list can be found here.


$ConfirmPreference  (default value = 'High')

$ConfirmPreference sets the default level for the -Confirm option on cmdlets.  When a developer makes a cmdlet they have the option of setting the impact level to 'Low', 'Medium', or 'High'.  When a command is run, if the impact level is greater than or equal to the level in $ConfirmPreference, a confirmation prompt appears before the action is executed.  The confirmation prompt will look something like this:

Are you sure you want to perform this action?
Performing operation "Stop-Process" on Target "calc (5852)".
[Y] Yes  [A] Yes to All  [N] No  [L] No to All  [S] Suspend  [?] Help (default is "Y"): 

To suppress confirmation prompts altogether, you can set $ConfirmPreference to 'None' (or pass -Confirm:$false to an individual command), which is especially useful when scripting High-impact operations.
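For a single command it's usually cleaner to override the preference at the call site. Using the Stop-Process example from the prompt above:

```powershell
# Never prompt for this one command, regardless of $ConfirmPreference:
Stop-Process -Name calc -Confirm:$false

# Force a prompt, even if the impact level wouldn't normally trigger one:
Stop-Process -Name calc -Confirm
```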


$OFS (default value = ' ')

$OFS is the Output Field Separator.  Let's say I make an array and then print it, like so:

PS C:\Users\tojo2000\Documents> $array = ('T', 'O', 'J', 'O')
PS C:\Users\tojo2000\Documents> echo $array
T
O
J
O

Each element of the array was echoed.  Notice what happens if we embed our array in a double-quoted string, though.
 
PS C:\Users\tojo2000\Documents> "My name is $array."
My name is T O J O.

Because the default value of $OFS is a single space, each element of the array is printed with a single space between.  We can set $OFS to be whatever we want, however:

PS C:\Users\tojo2000\Documents> $OFS = ' to the '
PS C:\Users\tojo2000\Documents> "My name is $array."
My name is T to the O to the J to the O.
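One caveat: $OFS stays set for the rest of your session. Removing the variable restores the default single-space separator:

```powershell
$OFS = ', '
"My name is $array."    # My name is T, O, J, O.

Remove-Variable OFS     # back to the default single space
"My name is $array."    # My name is T O J O.
```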


$DebugPreference (default value = 'SilentlyContinue')
$ProgressPreference (default value = 'Continue')
$VerbosePreference (default value = 'SilentlyContinue')
$WarningPreference (default value = 'Continue')

These preference variables tell PowerShell what to do if it comes across a Debug, Progress (when using the Write-Progress cmdlet, for example), Verbose, or Warning message.  

The possible options are:
  • Inquire - print the message, then prompt to continue
  • Continue - print the message, then continue
  • SilentlyContinue - suppress the message, then continue
  • Stop - error out of the script or cmdlet
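You can see the effect quickly with Write-Verbose (any message text works here):

```powershell
# Suppressed: $VerbosePreference defaults to 'SilentlyContinue'
Write-Verbose 'Checking disk space...'

$VerbosePreference = 'Continue'

# Now prints: VERBOSE: Checking disk space...
Write-Verbose 'Checking disk space...'
```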

Friday, July 25, 2008

A Quick and Easy Progress Bar for Your Scripts.

I just wanted to post this really quickly because I hadn't heard of it.  Have you ever ended up with a script iterating through a large list without much feedback, and wanted to let the user know that something really is going on in the background and please, please, don't hit Ctrl+C?

In the past I usually took the lazy way out, which is to just print another period every 100 iterations or so.  I just came across a cmdlet called Write-Progress that comes in handy here.

Check out the following code:

foreach ($index in (1..1000)) {
  Write-Progress -activity 'Destroying Earth' -status 'Obliterating...' -percentComplete ($index / 10)
}





Wednesday, July 23, 2008

Handling Errors in PowerShell

PowerShell gives you a few ways to handle errors in your scripts.  The most powerful tool in your arsenal is the exception trap, but it's a bit complicated and I can't claim to be a master of PowerShell exceptions, so I'll go over that later.  We'll start off with something you'll use a lot more, even though you may not know it.


PowerShell has a number of Preference Variables that you can use to control the way it behaves.  If you have the v2 CTP version installed you can run 'help about_Preference_Variables' to see the list; the rest of us can use the link above.  $ErrorActionPreference sets the way PowerShell responds when it hits a non-terminating error.  It won't affect errors that terminate a script.

The allowable values for $ErrorActionPreference are 'Continue' (default), 'SilentlyContinue', 'Inquire', and 'Stop'.

Which errors are terminating and which aren't?  There's no definitive list, but this example shows how it works:

PS HKLM:\> dir


   Hive: Microsoft.PowerShell.Core\Registry::HKEY_LOCAL_MACHINE

SKC  VC Name                           Property
---  -- ----                           --------
  7   4 COMPONENTS                     {StoreFormatVersion, StoreArchitecture, PublisherPolicyChangeTime, LastScavengeCookie}
  4   0 HARDWARE                       {}
  1   0 SAM                            {}
Get-ChildItem : Requested registry access is not allowed.
At line:1 char:3
+ dir <<<<
 17   0 SOFTWARE                       {}
  9   0 SYSTEM                         {}

______________________________________________________

PS HKLM:\> $ErrorActionPreference = 'SilentlyContinue'
PS HKLM:\> dir


   Hive: Microsoft.PowerShell.Core\Registry::HKEY_LOCAL_MACHINE

SKC  VC Name                           Property
---  -- ----                           --------
  7   4 COMPONENTS                     {StoreFormatVersion, StoreArchitecture, PublisherPolicyChangeTime, LastScavengeCookie}
  4   0 HARDWARE                       {}
  1   0 SAM                            {}
 17   0 SOFTWARE                       {}
  9   0 SYSTEM                         {}

______________________________________________________

PS HKLM:\> $ErrorActionPreference = 'Inquire'
PS HKLM:\> dir


   Hive: Microsoft.PowerShell.Core\Registry::HKEY_LOCAL_MACHINE

SKC  VC Name                           Property
---  -- ----                           --------
  7   4 COMPONENTS                     {StoreFormatVersion, StoreArchitecture, PublisherPolicyChangeTime, LastScavengeCookie}
  4   0 HARDWARE                       {}
  1   0 SAM                            {}

Confirm
Requested registry access is not allowed.
[Y] Yes  [A] Yes to All  [H] Halt Command  [S] Suspend  [?] Help (default is "Y"): y
Get-ChildItem : Requested registry access is not allowed.
At line:1 char:3
+ dir <<<<
 17   0 SOFTWARE                       {}
  9   0 SYSTEM                         {}


As you can see, the default is to print an error and continue, while 'SilentlyContinue' will suppress error messages, which is similar to the old 'On Error Resume Next' in VBScript.

One nice side-effect of having this Preference Variable is that you can change the behavior for just one code block or function within your script.  Once the variable inside of the block goes out of scope, $ErrorActionPreference reverts to the original value.

PS HKLM:\> function suppress_errors ([string]$path) {
>> $ErrorActionPreference = 'SilentlyContinue'
>>  echo "`$ErrorActionPreference = $ErrorActionPreference"
>>}
>>
PS HKLM:\> echo "`$ErrorActionPreference = $ErrorActionPreference"
$ErrorActionPreference = Continue
PS HKLM:\> suppress_errors
$ErrorActionPreference = SilentlyContinue
PS HKLM:\> echo "`$ErrorActionPreference = $ErrorActionPreference"
$ErrorActionPreference = Continue


NOTE: You can also use the -ErrorAction (or -ea) switch that is ubiquitous across cmdlets with the same values if you want to change PowerShell's behavior for a particular command.
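For example, to suppress the registry error from the earlier listing for just one command:

```powershell
dir HKLM:\ -ErrorAction SilentlyContinue

# -ea is the short form of the same switch:
dir HKLM:\ -ea SilentlyContinue
```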

Sunday, July 20, 2008

The Trouble with Piping

Update:  I was wrong, and I'm really happy about it.

PowerShell piping works slightly differently than I thought.  As each result is returned it is passed down the pipe, rather than waiting for the entire stage to complete.  My earlier tests didn't show what I thought they did, and that was my mistake.

The function I came up with for short-circuiting a recursive file search when a matching file is found is at the bottom of this post, but here is another way to do the same thing, based on the rest of the thread, that is much simpler.

Note:  there is one last caveat to watch out for:  if you put parentheses around some code, that code will be executed in its entirety before the pipeline continues, so keep that in mind when piping commands if you plan on short-circuiting the process early, as in these examples.
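A quick way to see the difference (C:\Windows is just a convenient large directory tree):

```powershell
# Streams: each file is passed down the pipe as dir finds it,
# so output starts appearing immediately
dir C:\Windows -Recurse | foreach { $_.Name }

# Evaluated first: the parenthesized dir runs to completion
# before a single object reaches foreach
(dir C:\Windows -Recurse) | foreach { $_.Name }
```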

# Find-File($path, $fileglob)
# Returns the first file that matches the fileglob.
# Args:
#   $path: [string] The path to the directory to start 
#     searching under
#   $fileglob: [string] The filename pattern to match 
#     (same format as dir)
#
# Returns:
#   The file object of the first matching file.

# New Version
function First-File([string]$path, [string]$fileglob){ 
    dir $path -include $fileglob -r | foreach {Write-Output $_; continue}
}

# Old Version
function Find-File ([string]$path, [string]$filename) { 
  $files = @(dir $path) 

  foreach ($file in $files) { 
    if (-not ($file.Mode -match '^d') -and ($file.Name -like "$filename")) { 
      return $file 
    } 
  } 

  foreach ($file in $files) { 
    if ($file.Mode -match '^d') { 
      $result = (Find-File $file.FullName $filename) 

      if ($result) { 
        return $result 
      } 
    } 
  }
}

Tuesday, July 15, 2008

Arrays in PowerShell

Sometimes PowerShell weirds me out a little bit because some parts look a lot like Perl, but don't act like it. There are a lot of bits and pieces strewn about the Web about how to use arrays and hashtables (I'll get to those later), but I had to look up several of them to get all of the information I wanted, so I'll lay them out here.

Arrays

An array is an ordered list of variables that can be accessed via their index.  It's a lot simpler than it sounds.  First, there are two ways to explicitly create an array:

PS C:\> [array]$my_list = (1, 7, 3, 19, 24, 2)
PS C:\> $my_list = @(1, 7, 3, 19, 24, 2)

Most people will just use the @() format.  This tells PowerShell to return whatever's between the parentheses as an array (even if there's only one member).

So let's take a look at our new array:

PS C:\> $my_list
1
7
3
19
24
2

PS C:\> $my_list.Length
6

PS C:\> $my_list[2]
3

Note that on that last one I used [2] at the end of the array object to denote that I want to grab the third object in the array.  The code below shows what the array looks like:

PS C:\> foreach ($index in (0 .. ($my_list.Length - 1))) {
>>  echo ("`$my_list[$($index)] = $($my_list[$index])")
>> }
>>
$my_list[0] = 1
$my_list[1] = 7
$my_list[2] = 3
$my_list[3] = 19
$my_list[4] = 24
$my_list[5] = 2

Note the '..' operator above.  This is a handy range operator, especially when working with arrays.  When I'm reading it in my head, I think of it as the word 'through', as in:

PS C:\> $count = (1..10)

Which would read: "$count equals one through 10".  This creates an array where the first 10 elements are the numbers 1 through 10.

Let's say I only want the first three elements of the array:

PS C:\> $count[0..2]
1
2
3

Or maybe I only want elements 1, 4, and 5

PS C:\> $count[1, 4, 5]
2
5
6

Here's a slightly trickier one:  Maybe I only want the last three elements.  By using negative numbers, you can count from the last element backwards:

PS C:\> $count[-1 .. -3]
10
9
8

Now what if I want everything from the second element on?  Naturally I would think that this would work:

PS C:\> $count[1 .. -1]
2
1
10

That didn't do what I expected.  It gave me the second element, but instead of counting forward until it reached the last element, it counted backwards until it reached -1, so instead of getting $count[1, 2, 3, 4, 5, 6, 7, 8, 9], I got $count[1, 0, -1].

We can do it, though; we have the technology:

PS C:\> $count[1 .. ($count.Length - 1)]
2
3
4
5
6
7
8
9
10

The Length or Count property of an array will give us the total number of elements in the array, but we have to subtract 1 because arrays start with an index of 0.

One last thing before I wrap this up:  remember $my_list?  It's all out of order, so let's sort it.

$my_list = $my_list | sort

There.  Much better.
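(sort here is an alias for Sort-Object, which also takes a -Descending switch if you want the reverse order:)

```powershell
$my_list | sort               # 1 2 3 7 19 24
$my_list | sort -Descending   # 24 19 7 3 2 1
```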

Tuesday, July 8, 2008

SQL Server 2008 PowerShell Support Controversy

As this article from concentratedtech.com shows, some people are a little upset with the SQL Server team.  The controversy occurs because they've opted to use a custom MiniShell.  I'd never heard of a MiniShell, but now I suspect that we'll see more of these.

Don Jones sums up the detractors' position pretty well:
But they have created a SQL Server-specific version of Windows PowerShell, which is what you use if you want to use SQL Server 2008 cmdlets or the new provider. And this new console is closed - you cannot add more functionality to it by using Add-PSSnapin. 
To say that this is “stupid” would be unkind to all the stupid people out there. This “closed console” is stupid on a level heretofore unknown. The SQL team is basically taking a great, all-purpose administrative tool and locking it down. They’re returning us to the good old days of Windows NT, when everything was administered by its own, separate, non-integrated admin tool. THIS SUCKS. 
Well it's all Jeffrey Snover's fault (see his full post here), but he's got some solid reasoning behind his decisions:
During the Vista reset, there was a great deal of anxiety about .NET versioning and general angst about instability arising from plugin models where code coming from the various sources run in the same process.  Imagine the case where you load plugins from 10 different sources and then something goes wrong - who is responsible?

...

The problem is not that SQL shipped a MiniShell but rather that there are SQL UX scenarios that use the MiniShell instead of a general purpose PowerShell.  The SQL Management Studio GUI has context menus which launch their MiniShell.  This is where we made a mistake.  By definition, this is an escape out to an environment to explore and/or perform ad hoc operations.  This environment does not benefit from the tight production promises that a MiniShell provides, in fact it is hampered by them.
He goes on to say that the issue will be rectified in a future release (SQL Server 2008 is only a Community Technology Preview now).

This brings up interesting possibilities, though.  I wonder how easy/hard it is to create a MiniShell.



Don't Believe Everything You Read

I made a boo-boo on my previous post about importing scripts as libraries.  In researching the dotsource operator I picked up some misinformation.  My brain was nagging at me that I might have gotten the wrong information already, but my tests had seemed to be working.

The dotsource operator does not run the script in the global scope.  It creates all variables and functions in the current scope.  If you're running a script at the command-line then this has the effect of running the script in the global scope, but in my example profile I made a mistake.  Using the dotsource operator where I put it ends up creating the functions inside the scope of the scriptblock that it is called in, not the global scope.

As a temporary measure I made the scripts that I want to import create their functions in the global scope, but that's just a temporary condition, dig*?

How did I make that mistake?  I forgot that I had a script in my path with the same name as the function I thought I was importing, making it look like the script had imported the function into the global scope.


* Sorry, every once in a while I channel George Clinton when I type.

Monday, July 7, 2008

Free Powershell Workbooks

I stumbled on this link on the Scripting With Windows Powershell site on Microsoft's website.

You can download the workbooks on the blog of one of Microsoft's Infrastructure Architects, Frank Koch, at Microsoft Switzerland.

Topics include an introduction to PowerShell and Windows Administration scripting with PowerShell.

Sunday, July 6, 2008

UPDATE: Importing Scripts as Libraries, Part Deux

I forgot to mention that the most convenient place to put this function is in your profile. What's that? You don't have a profile? Well, what are you waiting for?

Make a folder inside your Documents folder called WindowsPowerShell. Inside that folder create a script called Microsoft.PowerShell_profile.ps1. Think of it as .bashrc for PowerShell.

I've uploaded my profile as an example here.

Also, note that I'm searching under the path pointed to by the PSPATH environment variable. I have it set to C:\PSLibs in my profile.

Importing Scripts as Libraries, Part Deux

In this post I showed how using the dotsource operator lets us run a script and import all of its named objects from the script scope into the global scope. The effect is as if instead of calling a script you just typed its contents at the prompt.

This set off a lightbulb in my head, and gave me the solution to something I'd been pondering for a while: how to import scripts as libraries (modules, etc., whatever we're calling them these days). You see, if I come up with a really neat function I don't want to have to cut and paste it into every script that uses it, and if I come up with a lot of useful functions, I'd like to be able to share them with people.

This is what I came up with, and it incorporates a few new tricks I picked up along the way. The desired result: I can type 'import-script scriptname.ps1' and the script is automatically imported without my having to know the full path to the file.

Some highlights:
  • Test-Path is a nice cmdlet for checking the existence of a path on any PSDrive, which means any environment variable, registry key, file object, etc.
  • Exceptions work, but are really hard to get straight. Check out this link for some details.
  • The (, (command)) syntax is how I'm forcing the variable to be an array. I might get rid of it later, now that I'm using a boolean comparison instead of checking the length of the array to see whether any files were found.
  • In most examples people use Write-Error instead of echo to print error messages, but I wanted to have the error show up clearly without all of the red words and extra garbage.
  • Note that I'm searching everything under the folder pointed to by the PSPATH environment variable. I point mine to C:\PSLibs by default.

function Import-Script ([string]$script) {
  $local:ReportErrorShowSource = 0

  # Clean up the Exception messages a bit
  trap [Exception] {
    $errmsg = "`n`tError importing '$script': "
    $errmsg += ($_.Exception.Message + "`n")
    echo $errmsg
    break
  }

  # Check PSPATH
  if (-not (Test-Path "Env:pspath")) {
    throw "PSPath environment variable not set."
  } elseif (-not (Test-Path $env:pspath)) {
    throw "PSPATH environment variable points to invalid path."
  }

  $files = (, (dir $env:pspath $script -Recurse))

  # Make sure we find a single matching file
  if ($files.Length -gt 1) {
    throw ([string]$files.Length + " files of name '$script' found in PSPATH.")
  } elseif (-not $files) {
    throw "No files named '$script' found in PSPATH."
  }

  # Do the needful
  . $files[0].FullName
}

The DotSource Operator

A while back I asked Stephen Ng if he had seen any documentation on importing scripts as modules or libraries, and he hadn't. I sort of filed it away for future reference, knowing that if we went down this path we'd probably want to find some way to keep a repository of PowerShell scripts for common use.

Fast forward to a few days ago, when the latest entry from The PowerShell Guy showed up in my Google Reader. In his Get-IpConfig function (which I'll discuss in a later post) he has a line with an open parenthesis followed by a dot. It wasn't the most readable line of code in the world anyway, but as I was mentally parsing it I hit a snag. I had no idea what that dot was doing just after the open parenthesis.

As it turns out, this is what's known as the dotsource operator. The dotsource operator (I wish people would stop and think a bit when choosing these terrible names) is like the ampersand (&) operator, in that it executes the script or code block that comes after it, but it also declares all variables in the global scope. Why would you want to do this?

Let's use a terrible example and say I have a script that will get a list of users and groups on a computer. I have a script, Get-UsersAndGroups.ps1. It creates two variables, $users and $groups, and exits. I want to be able to use those variables, so I run the following:

PS C:\> .\Get-UsersAndGroups.ps1
PS C:\> $users

I get nothing back, even though I know I created the variable. $users was created in the script's scope, though, and once the script finishes running it takes its variables with it.

I have three options here:
  1. Change my script to send the variables to the output pipeline. Maybe this script is being used by other people, though, so I don't necessarily want to change the pipeline.
  2. Explicitly declare the variables as $global:users and $global:groups in the script. The problem with this is that I don't necessarily always want to declare these variables in the global namespace.
  3. Use the dotsource operator.
The way to use the dotsource operator is like this:

PS C:\> . .\Get-UsersAndGroups.ps1
PS C:\> $users
Tim
Administrator
Alfred E. Neuman

(Note that there is a space between the dots there, otherwise I'd just be telling PowerShell to look in the parent directory.)

To wrap this up, we can take all of this newfound knowledge and see how to create libraries in PowerShell. By dotsourcing (yes, people even use it as a verb, ugh) a script containing functions and constants, we can import the functions in a way that makes them usable by our scripts, so we can reuse the code.

The one big problem is that PowerShell has no concept of a /lib directory that it looks in for modules by default, so you must know the path to the script you are importing. I'll leave that as a future exercise, but I have the spark of an idea: a function that looks up a script to import by name, distributed through a common company profile so that all company PowerShell scripters have a shared location for libraries, like sitecustomize.py in Python.

How to Annoy Your Co-Workers (a.k.a. More Fun with .NET)

In this post by The PowerShell Guy (add it to your Google Reader if you haven't already), an annoying little prank is proposed for cheesing fools who leave their screens unlocked. It moves the mouse cursor to the top-left corner of the screen every 5 seconds or so.

I thought it sounded funny, so I typed it into PowerShell and got this:

PS C:\> do {[Windows.Forms.Cursor]::Position = "1,1";sleep 5}until (0)
Unable to find type [Windows.Forms.Cursor]: make sure that the assembly containing this type is loaded.
At line:1 char:27

So what did I do wrong? I thought at first that I'd misspelled it, but that wasn't the case. I typed it into PowerGUI ScriptEditor and ran the script, and it ran just fine.

I'd forgotten that not every .NET namespace is available in Powershell by default. Instead, you sometimes need to use this command:

PS C:\> [System.Reflection.Assembly]::LoadWithPartialName("System.Windows.Forms")

If you did it right, you'll get an acknowledgement with the name of the .dll file that was loaded. If you misspell the namespace, you get no response at all, and no error (why?!?).

GAC Version Location
--- ------- --------
True v2.0.50727 C:\Windows\assembly\GAC_MSIL\System.Windows.Forms\2.0.0.0__b77a5c561934e089\System.Windows.Forms.dll

I figured that was a lot to type in at once, so I created another script, Load-Assembly.ps1:

[System.Reflection.Assembly]::LoadWithPartialName($ARGS[0])

Then I created an alias to that script called "load". Now when I want to import an assembly, I can just do this:

load System.Windows.Forms

It will make pwning my next victim so much easier. >:)

Fun and Excitement with Variable Scoping!!!

Since PowerShell's syntax in scripts reminds me of Perl, my fingers keep trying to type the word "my" before I set every variable (you Perl geeks know what I'm talking about). Anyway, that got me thinking about scoping, which is amazing and exciting, and the most fun topic in the world.

Okay, so I was being sarcastic, but it is pretty important, and it turns out that there is a very good article on scoping right here. I'll try to break it down to a shorter, simpler version for the purposes of the blog, so here goes:

Let's say I do this:

PS C:\> $x = "Global"
PS C:\> function print_var {echo $x}
PS C:\> print_var

You'd expect it to output the string "Global", right?

What if I do this?

PS C:\> echo $x
PS C:\> function set_var {$x = "Local"; echo $x}
PS C:\> set_var
PS C:\> echo $x

You might expect the output to be

Global
Local
Local

but it's not. The output is

Global
Local
Global

That's because by default each variable is set in the local scope. In the case of the function set_var, a new variable named $x is created inside the script block that masks the global $x variable that was set outside of it. When PowerShell looks up the value of a variable, it starts in the local scope and then works its way up, to the parent scope, and then its parent, and so on until it reaches the global scope. If you accidentally create a new variable with the same name at a lower scope, PowerShell won't care that you also have a global variable with that name because it will stop looking as soon as it finds one.

The article I referenced earlier has some neat tricks you can do to set scope, and you can look up the help for Set-Variable for more info.

As a general rule it is best to just not set the value of a global variable inside a script block. It is better to return the value of the local variable as the output of the function and use that to set the variable in the parent scope. If you must set a global variable inside a script block, though, there are two ways to do it:

#1

PS C:\> echo $x
PS C:\> function set_var {$global:x = "Local"; echo $x}
PS C:\> set_var
PS C:\> echo $x

#2

PS C:\> echo $x
PS C:\> function set_var {Set-Variable -name x -value "Local" -scope global; echo $x}
PS C:\> set_var
PS C:\> echo $x

There are two easy rules that apply to any language and will help mitigate this issue without having to think about it too much, though:

Rule #1: Use as few global variables as absolutely necessary.
Rule #2: Never use generic names for variables like $x.


Setting up a Profile in PowerShell

Profiles in PowerShell are like "rc files" for Linux commands. The profile is a PowerShell script that is run every time you invoke the interpreter. You can use it to set up aliases, create default variables, or anything else that you can do in a PowerShell Script.

Profiles can be per-machine, per-user, or per-shell. See this website for the full details.

To set up your personalized profile for your user account, create the following file:

$env:userprofile\Documents\WindowsPowerShell\Microsoft.PowerShell_profile.ps1
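PowerShell also keeps this path in the built-in $profile variable, which makes setting the file up easy:

```powershell
# Does my profile exist yet?
Test-Path $profile

# If not, create it (and any missing parent folders):
New-Item -Path $profile -Type file -Force
```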

Here's the profile I set up for myself:

# PROFILE START ###########################

# ALIASES
set-alias subst 'subst-output.ps1'
set-alias match 'get-match.ps1'
set-alias ex 'explorer.exe'

# ENVIRONMENT
$env:pathext = (".PS1;" + $env:pathext)

# PATH
cd "$env:userprofile\Documents"

# PROFILE END #############################

Using the .NET WebClient to Scrape Web Pages

.NET comes with a nifty little class called System.Net.WebClient that lets you easily interact with a web page.

To play with it I decided to scrape the output of this page that generates Shakespearean insults and grab just the insult from the output, giving me easy command-line access to random Shakespearean insults (something I often find myself in need of, to be sure).

# Retrieves a random Shakespearean insult from the Internet.
#
# Author: Tojo2000 <tojo2000@tojo2000.com>
# (c)2008 All Rights Reserved
#
# Usage: get-insult.ps1

$regex = New-Object System.Text.RegularExpressions.Regex('\n([^<>]+)\n');

$web_client = New-Object System.Net.WebClient;
$web_client.Headers.Add("user-agent", "PowerWeb");

$data = $web_client.DownloadString("http://www.pangloss.com/seidel/Shaker/index.html");

if ($match = $regex.Match($data)) {
  echo $match.Groups[1].Value;
}

Note: I'm not affiliated with this website, so obviously don't abuse it.  It's just an example

Using .NET

I just want to give you guys a quick example of how easy it is to use .NET in your PowerShell scripts. Let's say Alfred E. Neuman has quit The Company, and I want to take all of his oncall shifts. In this scenario I have a text file called oncall.txt:

PS C:\Users\timjohnson\Documents> cat oncall.txt
20080613 timjohnson
20080614 bfong
20080615 mherzog
20080616 aneuman
20080617 timjohnson
20080618 bfong
20080619 mherzog
20080620 aneuman
20080621 timjohnson
20080622 bfong

I can create a regular expression object that will do the dirty work of replacing his name with mine like this:

PS C:\Users\timjohnson\Documents> $regex = new-object System.Text.RegularExpressions.Regex "\baneuman\b"
PS C:\Users\timjohnson\Documents> cat oncall.txt | foreach {$regex.Replace($_, "timjohnson")}
20080613 timjohnson
20080614 bfong
20080615 mherzog
20080616 timjohnson
20080617 timjohnson
20080618 bfong
20080619 mherzog
20080620 timjohnson
20080621 timjohnson
20080622 bfong

Powershell not only let me use $regex as a System.Text.RegularExpressions.Regex object, but it even lets me do tab completion for method names. Sweet.

Piping and PSDrives

Two of the oft-touted features of PowerShell are the idea that you can pipe objects from one cmdlet to the next and PSDrives.

In bash or cmd, when you pipe the output of one command to another, it sends the text that is printed to STDOUT to the STDIN of the next process. This means that there are a lot of commands just to help you massage and manipulate the text on the screen so that you can get the right output.

In PowerShell, the objects themselves are passed from one cmdlet to the next. Here's an example:

If I type

cd HKLM:

I am now at the root of the HKEY_LOCAL_MACHINE, since HKLM: is a PSDrive set up so that I can navigate the registry just like a drive on my computer. You can see the whole list of PSDrives by typing in "psdrive" at the command prompt.
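You can also map a drive of your own with New-PSDrive. A quick sketch (the drive name 'docs' is arbitrary):

```powershell
# List all of the drives currently available
Get-PSDrive

# Map 'docs:' onto my Documents folder for the current session
New-PSDrive -Name docs -PSProvider FileSystem -Root "$env:userprofile\Documents"
dir docs:
```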

If I type

dir

I get this:

PS HKLM:\> dir


Hive: Microsoft.PowerShell.Core\Registry::HKEY_LOCAL_MACHINE

 SKC  VC Name                Property
 ---  -- ----                --------
   7   3 COMPONENTS          {StoreFormatVersion, StoreArchitecture, PublisherPolicyChangeTime}
   4   0 HARDWARE            {}
   1   0 SAM                 {}
1042   0 Schema              {}
  15   0 SOFTWARE            {}
   9   0 SYSTEM              {}

(I'm on Vista running in non-administrator mode, so that's why I don't have access to the schema)

This appears to be sorted by the Name attribute, but let's say I don't want to sort on the Name attribute, I'm more interested in the SKC attribute (whatever that is):

PS HKLM:\> dir | sort SKC


Hive: Microsoft.PowerShell.Core\Registry::HKEY_LOCAL_MACHINE

 SKC  VC Name                Property
 ---  -- ----                --------
1042   0 Schema              {}
  15   0 SOFTWARE            {}
   9   0 SYSTEM              {}
   7   3 COMPONENTS          {StoreFormatVersion, StoreArchitecture, PublisherPolicyChangeTime}
   4   0 HARDWARE            {}
   1   0 SAM                 {}


I have the data that I want, but I want to display it differently:


PS HKLM:\> dir | sort SKC | format-list
Get-ChildItem : Requested registry access is not allowed.
At line:1 char:4
+ dir <<<< | sort SKC | format-list


Name : Schema
ValueCount : 0
Property : {}
SubKeyCount : 1042

Name : SOFTWARE
ValueCount : 0
Property : {}
SubKeyCount : 15

Name : SYSTEM
ValueCount : 0
Property : {}
SubKeyCount : 9

Name : COMPONENTS
ValueCount : 3
Property : {StoreFormatVersion, StoreArchitecture, PublisherPolicyChangeTime}
SubKeyCount : 7

Name : HARDWARE
ValueCount : 0
Property : {}
SubKeyCount : 4

Name : SAM
ValueCount : 0
Property : {}
SubKeyCount : 1

If you want to pipe a non-cmdlet, then it will send the text output to the pipe just like you're used to, but you can manipulate it easily in PowerShell. $_ is a special "default" variable for the input, similar to $_ in Perl. This lets you do things like this:

cat xunlei.txt | where {$_ -match '\sBEJ-73716507\s'}
cat xunlei.txt | where {$_ -match '\sBEJ-73716507\s'} | foreach {echo("Found it!!! " + $_)}

The first command pipes the output of cat (an alias for Get-Content; echo is an alias for Write-Output) to the where statement, which filters out every line that doesn't match the regular expression, kind of like grep. The second command does the same, but prepends the string "Found it!!! " to each matching line before printing it to the screen.


NOTE: When using the + operator to concatenate strings as a command argument, it is usually necessary to put parentheses around the whole expression. Otherwise each piece is treated as a separate argument and it won't do what you expect.
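Here's what that looks like in practice:

```powershell
$count = 3

# With parentheses: one string argument
echo ("Found " + $count + " items")

# Without them, echo receives five separate arguments
# ("Found ", "+", 3, "+", " items"), each printed on its own line
echo "Found " + $count + " items"
```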