This post was last edited by the author on 2009-5-17 12:35:37
I. Using asynchronous workflows in AutoCAD to simplify concurrent programming (F#)
January 25, 2008
Using F# Asynchronous Workflows to simplify concurrent programming in AutoCAD
In the last post we saw some code that downloaded data - serially - from a number of websites via RSS and created AutoCAD entities linking to the various posts.
As promised, in today's post we take that code and enable it to query the same data in parallel by using Asynchronous Workflows in F#. Asynchronous Workflows are an easy-to-use yet powerful mechanism for enabling concurrent programming in F#.
Firstly, a little background as to why this type of technique is important. As many - if not all - of you are aware, the days of raw processor speed doubling every couple of years are over. The technical innovations that enabled Moore's Law to hold true for half a century are - at least in the area of silicon-based microprocessor design - hitting a wall (it's apparently called the Laws of Physics :-). Barring some disruptive technological development, the future gains in computing performance are to be found in the use of parallel processing, whether via multiple cores, processors or distributed clouds of computing resources.
Additionally, with an increasing focus on distributed computing and information resources, managing tasks asynchronously becomes more important, as information requests across a network inevitably introduce a latency that can be mitigated by the tasks being run in parallel.
The big problem is that concurrent programming is - for the most part - extremely difficult to do, and even harder to retrofit into existing applications. Traditional lock-based parallelism (where locks are used to control access to shared computing resources) is both unwieldy and prone to blocking. New technologies, such as Asynchronous Workflows and Software Transactional Memory, provide considerable hope (and this is a topic I have on my list to cover at some future point).
Today's post looks at a relatively simple scenario: we want to perform a set of discrete tasks in parallel, harnessing those fancy multi-core systems for those of you lucky enough to have them (I'm hoping to get one when I next replace my notebook, sometime in March). The tasks are genuinely independent: we want to wait until they are all complete, but we do not have the additional burden of them communicating amongst themselves or using shared resources (e.g. accessing shared memory) during their execution.
We are also going to be very careful only to run parallel tasks unrelated to AutoCAD. Any access made into AutoCAD's database, for instance, needs to be performed in series: AutoCAD is not thread-safe when it comes to the vast majority of its programmatically-accessible functionality. So we're going to run a set of asynchronous, parallel tasks to query our various RSS feeds, and combine the results before creating the corresponding geometry in AutoCAD. This all sounds very complex, but the good (actually great) news is that Asynchronous Workflows do all the heavy lifting. Phew.
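To make the shape of this concrete, here is a minimal sketch of the pattern that runs outside AutoCAD. It is not the code from this post: `fakeFetch` and `fetchAll` are hypothetical names, `Async.Sleep` stands in for network latency, and it assumes a current F# release, where `Async.RunSynchronously` is the name for what this post's F# version calls `Async.Run`.

```fsharp
// Hypothetical stand-in for a slow, independent task such as a web request:
// it waits for delayMs milliseconds and then returns a string.
let fakeFetch (name: string) (delayMs: int) : Async<string> =
    async {
        do! Async.Sleep delayMs
        return sprintf "contents of %s" name
    }

// Run every task in parallel and block until all of them have finished.
// Async.Parallel hands the results back as an array, in input order.
let fetchAll (sources: (string * int) list) : string[] =
    sources
    |> List.map (fun (name, delayMs) -> fakeFetch name delayMs)
    |> Async.Parallel
    |> Async.RunSynchronously

let results = fetchAll [ ("a", 30); ("b", 10); ("c", 20) ]

// The serial stage: in the real command this is where the drawing
// database would be touched, one result at a time, on the calling thread.
for r in results do
    printfn "%s" r
```

The point of the split is that only the first stage runs concurrently; everything that would touch AutoCAD happens after all the parallel work has been collected.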
Here's the modified F# code:
- // Use lightweight F# syntax
- #light
- // Declare a specific namespace and module name
- module MyNamespace.MyApplication
- // Import managed assemblies
- #I @"C:\Program Files\Autodesk\AutoCAD 2008"
- #r "acdbmgd.dll"
- #r "acmgd.dll"
- open Autodesk.AutoCAD.Runtime
- open Autodesk.AutoCAD.ApplicationServices
- open Autodesk.AutoCAD.DatabaseServices
- open Autodesk.AutoCAD.Geometry
- open System.Xml
- open System.Collections
- open System.Collections.Generic
- open System.IO
- open System.Net
- open Microsoft.FSharp.Control.CommonExtensions
- // The RSS feeds we wish to get. The first two values are
- // only used if our code is not able to parse the feed's XML
- let feeds =
-   [ ("Through the Interface",
-      "http://blogs.autodesk.com/through-the-interface",
-      "http://through-the-interface.typepad.com/through_the_interface/rss.xml");
-
-     ("Don Syme's F# blog",
-      "http://blogs.msdn.com/dsyme/",
-      "http://blogs.msdn.com/dsyme/rss.xml");
-
-     ("Shaan Hurley's Between the Lines",
-      "http://autodesk.blogs.com/between_the_lines",
-      "http://autodesk.blogs.com/between_the_lines/rss.xml");
-
-     ("Scott Sheppard's It's Alive in the Lab",
-      "http://blogs.autodesk.com/labs",
-      "http://labs.blogs.com/its_alive_in_the_lab/rss.xml");
-
-     ("Lynn Allen's Blog",
-      "http://blogs.autodesk.com/lynn",
-      "http://lynn.blogs.com/lynn_allens_blog/index.rdf");
-     ("Heidi Hewett's AutoCAD Insider",
-      "http://blogs.autodesk.com/autocadinsider",
-      "http://heidihewett.blogs.com/my_weblog/index.rdf") ]
- // Fetch the contents of a web page, asynchronously
- let httpAsync(url:string) =
-   async { let req = WebRequest.Create(url)
-           use! resp = req.GetResponseAsync()
-           use stream = resp.GetResponseStream()
-           use reader = new StreamReader(stream)
-           return reader.ReadToEnd() }
- // Load an RSS feed's contents into an XML document object
- // and use it to extract the titles and their links
- // Hopefully these always match (this could be coded more
- // defensively)
- let titlesAndLinks (name, url, xml) =
-   let xdoc = new XmlDocument()
-   xdoc.LoadXml(xml)
-   let titles =
-     [ for n in xdoc.SelectNodes("//*[name()='title']")
-         -> n.InnerText ]
-   let links =
-     [ for n in xdoc.SelectNodes("//*[name()='link']") ->
-         let inn = n.InnerText
-         if inn.Length > 0 then
-           inn
-         else
-           let href = n.Attributes.GetNamedItem("href").Value
-           let rel = n.Attributes.GetNamedItem("rel").Value
-           if href.Contains("feedburner") then
-             ""
-           else
-             href ]
-
-   let descs =
-     [ for n in xdoc.SelectNodes
-         ("//*[name()='description' or name()='content' or name()='subtitle']")
-           -> n.InnerText ]
-   // A local function to filter out duplicate entries in
-   // a list, maintaining their current order.
-   // Another way would be to use:
-   // Set.of_list lst |> Set.to_list
-   // but that results in a sorted (probably reordered) list.
-   let rec nub lst =
-     match lst with
-     | a::[] -> [a]
-     | a::b ->
-       if a = List.hd b then
-         nub b
-       else
-         a::nub b
-     | [] -> []
-   // Filter the links to get (hopefully) the same number
-   // and order as the titles and descriptions
-   let real = List.filter (fun (x:string) -> x.Length > 0)
-   let lnks = real links |> nub
-   // Return a link to the overall blog, if we don't have
-   // the same numbers of titles, links and descriptions
-   let lnum = List.length lnks
-   let tnum = List.length titles
-   let dnum = List.length descs
-
-   if tnum = 0 || lnum = 0 || lnum <> tnum || dnum <> tnum then
-     [(name,url,url)]
-   else
-     List.zip3 titles lnks descs
- // For a particular (name,url) pair,
- // create an AutoCAD HyperLink object
- let hyperlink (name,url,desc) =
-   let hl = new HyperLink()
-   hl.Name <- url
-   hl.Description <- desc
-   (name, hl)
- // Use asynchronous workflows in F# to download
- // an RSS feed and return AutoCAD HyperLinks
- // corresponding to its posts
- let hyperlinksAsync (name, url, feed) =
-   async { let! xml = httpAsync feed
-           let tl = titlesAndLinks (name, url, xml)
-           return List.map hyperlink tl }
- // Now we declare our command
- [<CommandMethod("rss")>]
- let createHyperlinksFromRss() =
-
-   // Let's get the usual helpful AutoCAD objects
-
-   let doc =
-     Application.DocumentManager.MdiActiveDocument
-   let db = doc.Database
-   // "use" has the same effect as "using" in C#
-   use tr =
-     db.TransactionManager.StartTransaction()
-   // Get appropriately-typed BlockTable and BTRs
-   let bt =
-     tr.GetObject
-       (db.BlockTableId,OpenMode.ForRead)
-     :?> BlockTable
-   let ms =
-     tr.GetObject
-       (bt.[BlockTableRecord.ModelSpace],
-        OpenMode.ForWrite)
-     :?> BlockTableRecord
-
-   // Add text objects linking to the provided list of
-   // HyperLinks, starting at the specified location
-
-   // Note the valid use of tr and ms, as they are in scope
-   let addTextObjects pt lst =
-     // Use a for loop, as we care about the index to
-     // position the various text items
-     let len = List.length lst
-     for index = 0 to len - 1 do
-       let txt = new DBText()
-       let (name:string,hl:HyperLink) = List.nth lst index
-       txt.TextString <- name
-       let offset =
-         if index = 0 then
-           0.0
-         else
-           1.0
-       // This is where you can adjust:
-       // the initial outdent (x value)
-       // and the line spacing (y value)
-       let vec =
-         new Vector3d
-           (1.0 * offset,
-            -0.5 * (Int32.to_float index),
-            0.0)
-       let pt2 = pt + vec
-       txt.Position <- pt2
-       ms.AppendEntity(txt) |> ignore
-       tr.AddNewlyCreatedDBObject(txt,true)
-       txt.Hyperlinks.Add(hl) |> ignore
-   // Here's where we do the real work, by firing
-   // off - and coordinating - asynchronous tasks
-   // to create HyperLink objects for all our posts
-   let links =
-     Async.Run
-       (Async.Parallel
-         [ for (name,url,feed) in feeds ->
-           hyperlinksAsync (name,url,feed) ])
-   // Add the resulting objects to the model-space
-   let len = Array.length links
-   for index = 0 to len - 1 do
-     // This is where you can adjust:
-     // the column spacing (x value)
-     // the vertical offset from origin (y axis)
-     let pt =
-       new Point3d
-         (15.0 * (Int32.to_float index),
-          30.0,
-          0.0)
-     addTextObjects pt (Array.get links index)
-
-   tr.Commit()
A few comments on the changes (the line numbers refer to the author's original listing, not to the code as reproduced here):
Lines 57-62 define our new httpAsync() function, which uses GetResponseAsync() - a function exposed in F# 1.9.3.7 - to download the contents of a web-page asynchronously [and which I stole shamelessly from Don Syme, who presented the code last summer at Microsoft's TechEd].
Lines 141-144 define another asynchronous function, hyperlinksAsync(), which calls httpAsync() and then - as before - extracts the feed information and creates a corresponding list of HyperLinks. This is significant: creation of AutoCAD HyperLink objects will be done in parallel; it is the addition of these objects to the drawing database that needs to be performed serially.
Lines 214-217 replace our very simple "map" with something slightly more complex: this code runs a list of tasks in parallel and waits for them all to complete before continuing. What is especially cool about this implementation is the fact that exceptions in individual tasks result in the overall task failing (a good thing, believe it or not :-), and the remaining tasks being terminated gracefully.
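This failure behaviour can be seen in a small sketch, again outside AutoCAD and with hypothetical names (`work`, `outcome`). It assumes a current F# release, where `Async.Catch` turns the overall result into a `Choice` value and `Async.RunSynchronously` plays the role of this post's `Async.Run`.

```fsharp
// Hypothetical task: succeeds with its id, unless told to fail.
let work (id: int) (shouldFail: bool) : Async<int> =
    async {
        do! Async.Sleep 10
        if shouldFail then
            failwith (sprintf "task %d failed" id)
        return id
    }

// One of the three tasks throws, so the combined computation fails as a whole.
// Async.Catch reports that as Choice2Of2 instead of raising the exception.
let outcome =
    [ work 1 false; work 2 true; work 3 false ]
    |> Async.Parallel
    |> Async.Catch
    |> Async.RunSynchronously

match outcome with
| Choice1Of2 ids -> printfn "all tasks succeeded: %A" ids
| Choice2Of2 ex -> printfn "overall task failed: %s" ex.Message
```

For the RSS command this means a single unreachable feed would fail the whole batch, so nothing half-finished reaches the drawing database.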
Lines 221 and 233 change our code to handle an array, rather than a list (while "map" previously returned a list, Async.Run applied to the result of Async.Parallel returns an array).
When run, the code creates exactly the same thing as last time (although there are a few more posts in some of the blogs ;-)
A quick word on timing: I used "F# Interactive" to do a little benchmarking on my system, and even though it's single-core, single-processor, there was a considerable difference between the two implementations. I'll talk more about F# Interactive at some point, but think of it as being to F# in Visual Studio what the command-line is to LISP in AutoCAD: you can very easily test out fragments of F#, either by entering them directly into the F# Interactive window or by highlighting them in Visual Studio's text editor and hitting Alt-Enter.
To enable function timing I entered "#time;;" (without the quotation marks) in the F# Interactive window. I then selected and loaded the supporting functions needed for each test - not including the code that adds the DBText objects with their HyperLinks to the database, as we're only in Visual Studio, not inside AutoCAD - and executed the "let links = ..." assignment in our two implementations of the createHyperlinksFromRss() function (i.e. the RSS command). These functions do create lists of AutoCAD HyperLinks, but that's OK: this is something that works even outside AutoCAD, although we wouldn't be able to do anything much with them. Also, the fact we're not including the addition of the entities to the AutoCAD database is not relevant: by then we should have identical data in both versions, which would be added in exactly the same way.
Here are the results. I executed the code for serial querying and parallel querying twice (to make sure there were no effects from page caching on the measurement). The "list list" results come from the serial version and the "list array" results from the parallel one:
- val links : (string * HyperLink) list list
- Real: 00:00:14.636, CPU: 00:00:00.15, GC gen0: 5, gen1: 1, gen2: 0
- val links : (string * HyperLink) list array
- Real: 00:00:06.245, CPU: 00:00:00.31, GC gen0: 3, gen1: 0, gen2: 0
- val links : (string * HyperLink) list list
- Real: 00:00:15.45, CPU: 00:00:00.46, GC gen0: 5, gen1: 1, gen2: 0
- val links : (string * HyperLink) list array
- Real: 00:00:03.832, CPU: 00:00:00.62, GC gen0: 2, gen1: 1, gen2: 0
So the serial execution took 14.5 to 15.5 seconds, while the parallel execution took 3.8 to 6.3 seconds.
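For anyone wanting to reproduce this kind of comparison without the network or the AutoCAD assemblies, here is a rough sketch. It is an assumption-laden stand-in, not the benchmark above: `simulatedDownload` and `timed` are hypothetical names, `Async.Sleep` replaces the real downloads, a `Stopwatch` replaces `#time`, and `Async.RunSynchronously` is the current name for this post's `Async.Run`.

```fsharp
open System.Diagnostics

// Hypothetical stand-in for one feed download: about 100 ms of latency.
let simulatedDownload (i: int) : Async<int> =
    async {
        do! Async.Sleep 100
        return i
    }

// Time a function call; return its result with the elapsed milliseconds.
let timed (f: unit -> 'a) : 'a * int64 =
    let sw = Stopwatch.StartNew()
    let result = f ()
    sw.Stop()
    (result, sw.ElapsedMilliseconds)

let inputs = [ 1 .. 4 ]

// Serial: each download finishes before the next one starts.
let (serialResults, serialMs) =
    timed (fun () ->
        inputs |> List.map (fun i -> simulatedDownload i |> Async.RunSynchronously))

// Parallel: all downloads are in flight at the same time.
let (parallelResults, parallelMs) =
    timed (fun () ->
        inputs |> List.map simulatedDownload |> Async.Parallel |> Async.RunSynchronously)

printfn "serial: %d ms, parallel: %d ms" serialMs parallelMs
```

Because the time here is spent waiting rather than computing, the parallel version should come out well ahead even on a single core, which matches the measurements above.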