Convert DOC to DOCX using PowerShell

I was tasked with taking a large number of .DOC and .RTF files and converting them to .DOCX. The files were then going to be imported into a SharePoint site. So I went out on the web looking for PowerShell scripts to accomplish this. There are plenty to choose from.

All the examples on the web were the same with some minor modifications. Most of them followed this pattern:

$word = New-Object -ComObject Word.Application
$word.Visible = $false
$saveFormat = [Enum]::Parse([Microsoft.Office.Interop.Word.WdSaveFormat], "wdFormatDocumentDefault")

# Get the files
$folderpath = "c:\doclocation\*"
$fileType = "*doc"

# The opening brace must sit on the same line as ForEach-Object,
# otherwise PowerShell prompts for the missing -Process parameter
Get-ChildItem -Path $folderpath -Include $fileType | ForEach-Object {
    # Open the document, strip the old extension, and save in the default (.docx) format
    $opendoc = $word.Documents.Open($_.FullName)
    $savename = ($_.FullName).Substring(0, ($_.FullName).LastIndexOf("."))
    $opendoc.SaveAs([ref]"$savename", [ref]$saveFormat)
    $opendoc.Close()
}

#Clean up
$word.quit()

After trying out several of them, I started converting some test documents. All went well until the files were uploaded to SharePoint. The .RTF files were fine, but even though the .DOC files were now .DOCX files, they did not allow all the functionality of .DOCX to be used.

After investigating a little further, it turned out that a conversion from .DOC to .DOCX leaves the files in compatibility mode. The files are smaller, but they don’t allow for things like co-authoring.
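One way to confirm whether a converted file is still in compatibility mode is to read the document's CompatibilityMode property, which returns a WdCompatibilityMode value (11 for Word 2003, 12 for Word 2007, 14 for Word 2010, 15 for Word 2013). The sketch below is only an illustration: the function names Get-CompatibilityModeName and Get-DocCompatibilityMode are hypothetical helpers, not part of the original script, and the second one assumes Word is installed on the machine.

```powershell
# Hypothetical helper: translate a WdCompatibilityMode value into a readable label.
function Get-CompatibilityModeName {
    param([int]$Mode)
    switch ($Mode) {
        11      { 'Word 2003' }
        12      { 'Word 2007' }
        14      { 'Word 2010' }
        15      { 'Word 2013' }
        default { "Unknown ($Mode)" }
    }
}

# Hypothetical helper: open a document and report its compatibility mode.
# Assumes Word is installed and $Path is the full path to the file.
function Get-DocCompatibilityMode {
    param([string]$Path)
    $word = New-Object -ComObject Word.Application
    $word.Visible = $false
    try {
        $doc = $word.Documents.Open($Path)
        $mode = $doc.CompatibilityMode   # read-only property on the Document object
        $doc.Close()
        Get-CompatibilityModeName $mode
    }
    finally {
        # Always shut Word down, even if opening the file failed
        $word.Quit()
    }
}
```

Calling Get-DocCompatibilityMode with the path of a converted file (for example a file under c:\doclocation) should show whether it is still at the Word 2003 level.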

So it was back to the drawing board and the web, where I found a way to turn compatibility mode off. The problem was that it required more steps, including saving and reopening each file. To use this method I had to add a compatibility mode object:

$CompatMode = [Enum]::Parse([Microsoft.Office.Interop.Word.WdCompatibilityMode], "wdWord2010")

And then change the code inside the {} from above to:

{
    $opendoc = $word.Documents.Open($_.FullName)
    $savename = ($_.FullName).Substring(0, ($_.FullName).LastIndexOf("."))
    $opendoc.SaveAs([ref]"$savename", [ref]$saveFormat)
    $opendoc.Close()
    # Reopen the saved file (assuming Word appended .docx to the name) and switch off compatibility mode
    $converteddoc = Get-ChildItem "$savename.docx"
    $opendoc = $word.Documents.Open($converteddoc.FullName)
    $opendoc.SetCompatibilityMode($compatMode)
    $opendoc.Save()
    $opendoc.Close()
}

It worked, but I didn’t like it. So back to the web again, and this time I stumbled across the real way to do it: use the Convert method. No one else seems to have used this in any of the examples, but it is a much cleaner way to do it than the compatibility mode setting. So this is how I changed my code, and now all the files come into SharePoint as true .DOCX files.

$word = New-Object -ComObject Word.Application
$word.Visible = $false
$saveFormat = [Enum]::Parse([Microsoft.Office.Interop.Word.WdSaveFormat], "wdFormatDocumentDefault")

# Get the files
$folderpath = "c:\doclocation\*"
$fileType = "*doc"

# The opening brace must sit on the same line as ForEach-Object
Get-ChildItem -Path $folderpath -Include $fileType | ForEach-Object {
    $opendoc = $word.Documents.Open($_.FullName)
    $savename = ($_.FullName).Substring(0, ($_.FullName).LastIndexOf("."))
    # Convert is a method of the document, not of the Word application
    $opendoc.Convert()
    $opendoc.SaveAs([ref]"$savename", [ref]$saveFormat)
    $opendoc.Close()
}

#Clean up
$word.quit()
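
If a document fails to open or save partway through a batch, the script above never reaches $word.quit() and a hidden Word process can be left running. The following is a minimal sketch of one way to guard against that with try/finally. The function names Get-DocxPath and Convert-DocFolder are hypothetical, and building the target name with [System.IO.Path]::ChangeExtension is my own substitution for the Substring call, not something taken from the examples I found.

```powershell
# Hypothetical helper: build the target .docx path from the source path.
function Get-DocxPath {
    param([string]$SourcePath)
    [System.IO.Path]::ChangeExtension($SourcePath, '.docx')
}

# Hypothetical wrapper around the conversion loop; assumes Word is installed.
# $FolderPath should end in \* so that -Include works without -Recurse.
function Convert-DocFolder {
    param([string]$FolderPath)
    $word = New-Object -ComObject Word.Application
    $word.Visible = $false
    $saveFormat = [Enum]::Parse([Microsoft.Office.Interop.Word.WdSaveFormat], 'wdFormatDocumentDefault')
    try {
        Get-ChildItem -Path $FolderPath -Include '*.doc' | ForEach-Object {
            $opendoc = $word.Documents.Open($_.FullName)
            try {
                $target = Get-DocxPath $_.FullName
                $opendoc.Convert()
                $opendoc.SaveAs([ref]$target, [ref]$saveFormat)
            }
            finally {
                # Close the document even if Convert or SaveAs throws
                $opendoc.Close()
            }
        }
    }
    finally {
        # Quit Word even if a file in the batch fails
        $word.Quit()
    }
}
```

It would be called the same way the folder path is set above, for example Convert-DocFolder "c:\doclocation\*".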

6 Responses to Convert DOC to DOCX using PowerShell

  1. Jordan says:

    Thanks a lot for this. I too had this very requirement with needing to migrate a large number of older Office files from shared drives to SharePoint and this was exactly what I needed.

    A couple of comments about the code for other users that might stumble across this article:

    First:
    $word.Convert() needs to be changed to $opendoc.Convert().

    If you don’t do this, you’ll get hit with error messages about the Convert method not being a part of the ApplicationClass. This can be misleading for some because the SaveAs() method still runs successfully, and thus, converts the document to .DOCX. So a user might think this worked anyways, but this actually causes the document to remain stuck in Compatibility Mode, which we’re trying to avoid.

    Second:
    If you run the code and are prompted with a message such as:

    Supply values for the following parameters:
    Process[0]:

    All you need to do is place the first { following the foreach-object command on the same line. For example:

    Get-ChildItem -path $folderpath -include $fileType | foreach-object {

    And one additional note:

    If you need to do this on all .DOC files within numerous subfolders, set your $folderPath variable to the root folder and just add the -recurse parameter to the Get-ChildItem command:

    Get-ChildItem -path $folderpath -include $fileType -recurse | foreach-object {

    Once again, thank you very much for your digging to find out about adding the Convert method :)

    Jordan

  2. Ryan says:

    Hello nice work with the script – it worked for me.

    After converting using the OFC I’m left with converted files but only up to the 2007 level so I’m trying to use your script to take them the rest of the way to 2010 (2013 eventually).

    Is there a way (as with OFC) to preserve the Creation/Modify Dates along with the file’s metadata (Author, Last Saved By, etc.) during this conversion?

  3. Mike says:

    Very Cool ! Thanks a lot Rob – worked great.

    Thanks to Jordan too – I did get the prompt that required putting the “{” on the same line. I also made his $opendoc.convert() change – did not try it before that.

    I had to convert over 400 files and now they are beautiful docx.

  4. Update your post, please

    For me, works using

    $opendoc.Convert();

    NOT $word.Convert();

    Now this references:
    http://kiquenet.wordpress.com/2014/02/20/convert-doc-office-97-2003-to-docx-office-2010/

    Another way is using SetCompatibilityMode
