I'm currently playing with random ideas I have for Nimitz (aka the next version of my NWMLS search tool), and one of the things I'm thinking about is displaying the Median value of properties that match an arbitrary search criteria. I already do something similar to this now on
Zearch (the current version of my NWMLS search tool). with MIN, MAX, and AVG. So I wrote a Stored Proc (using MS SQL 2005 features), and came up with the following. (FYI - I was inspired by
this blog post).
1: CREATE PROC [dbo].[GetMedian]
2: (
3: @Col varchar(100),
4: @Table varchar(100),
5: @SqlWhere varchar(1000)
6: )
7: AS
8: BEGIN
9:
10: DECLARE @Count AS INT, @MiddleRow AS INT, @Median AS INT
11: DECLARE @SqlAll nvarchar(MAX)
12:
13: SET @SqlAll = 'WITH MLSSTATS AS '
14: SET @SqlAll = @SqlAll + '('
15: SET @SqlAll = @SqlAll + 'SELECT ROW_NUMBER() OVER (ORDER BY ' + @Col
16: SET @SqlAll = @SqlAll + ' DESC ) AS RowNum , '
17: SET @SqlAll = @SqlAll + '(SELECT COUNT(*) FROM ' +@Table + ' '
18: SET @SqlAll = @SqlAll + @SqlWhere + ') As Cnt,'
19: SET @SqlAll = @SqlAll + @Col
20: SET @SqlAll = @SqlAll + ' FROM ' + @Table + ' ' + @SqlWhere
21: SET @SqlAll = @SqlAll + ') '
22: SET @SqlAll = @SqlAll + 'select AVG(' + @Col +') from MLSSTATS '
23: SET @SqlAll = @SqlAll + 'where rownum IN((Cnt + 1) / 2, (Cnt + 2) / 2)'
24:
25:
26: EXEC sp_executesql @SqlAll
27: END
So if, let's say, I wanted to find the median value of all the houses currently listed in the city of Redmond, I'd do something like this...
1: GetMedian 'ListPrice', 'MLSTABLE', 'WHERE City in (''Redmond'')'
And then the proc would do it's dynamic SQL black magic and end up executing this...
1: WITH MLSSTATS AS
2: (
3: SELECT ROW_NUMBER() OVER (ORDER BY ListPrice DESC ) AS RowNum ,
4: (SELECT COUNT(*) FROM MLSTABLE WHERE City in ('Redmond')) As Cnt,
5: ListPrice
6: FROM MLSTABLE
7: WHERE City in ('Redmond')
8: )
9: select AVG(ListPrice)
10: from MLSSTATS
11: where rownum IN((Cnt + 1) / 2, (Cnt + 2) / 2)
The SQL is ugly, but that's that nature of the beast. Anyway, the problem is that I really want do to this...
1: SELECT AVG(listprice), MEDIAN('ListPrice', 'MLSTABLE', 'WHERE City = ''Redmond''')
2: FROM MLSTABLE
3: WHERE City = 'Redmond'
Or better yet, try to implement Median as a user defined aggregate function in C# / CLR language so it's even easier to use like so...
1: SELECT AVG(listprice), MEDIAN(ListPrice)
2: FROM MLSTABLE
3: WHERE City = 'Redmond'
According the above blog post it can't be done in a CLR language (which I don't understand). I guess user defined aggregates written in a CLR language only get the values of a set one at a time, instead of the entire set (which you would obviously need to compute a median)? Either way, I'm wondering if there's a way to convert my ugly proc into an ugly function instead? I think what's tripping me up, is that I need to define a variable that's accessible from dynamic SQL and returnable from the function that calls it.
Print | posted on Tuesday, May 01, 2007 8:38 AM