The Development Blog of Tim Graupmann
New Shader Contest
[Pixel Shader Contest]
Prizes include 3D STUDIO MAX.
This is a helpful code snippet I found on MSDN about programming Win32 in C#.
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dndotnet/html/faq111700.asp
Can I use the Win32 API from a .NET Framework program?
Yes. Using platform invoke, .NET Framework programs can access native code libraries by means of static DLL entry points.
Here is an example of C# calling the Win32 MessageBox function:
using System;
using System.Runtime.InteropServices;

class MainApp
{
    // Declare MessageBox from user32.dll as a static DLL entry point.
    // IntPtr is used for the window handle so the declaration remains
    // correct on 64-bit platforms.
    [DllImport("user32.dll", EntryPoint = "MessageBox")]
    public static extern int MessageBox(IntPtr hWnd, String strMessage,
        String strCaption, uint uiType);

    public static void Main()
    {
        MessageBox(IntPtr.Zero, "Hello, this is PInvoke in operation!", ".NET", 0);
    }
}
This new tutorial covers where to get a free MySQL database and a free Apache/mod_perl web server, how to create and add tables to your MySQL database, how to configure the Apache MySQL drivers/wrappers, and finally a sample Perl script that connects to your MySQL database. MySQL Tutorial.
Some useful commands when installing a MySQL database on Win2k.
Commands successfully tested with Win2k and [MySQL 4.0]
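For illustration, a typical MySQL 4.0 service setup on Win2k looks something like the following. This is a hedged sketch assuming a default C:\mysql install, not the tested list itself:

cd C:\mysql\bin
mysqld-nt --install
net start mysql
mysqladmin -u root password new_password
mysql -u root -p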
Here is more excellent documentation on how to access MySQL from C# using the ODBC drivers: 7.5 Can I access MySQL from the .NET environment using Connector/ODBC?
Exploring MySQL in the Microsoft .NET Environment
ByteFX.Data – MySQL Native .NET Providers
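To make the ODBC route concrete, here is a minimal C# sketch of querying MySQL through Connector/ODBC. The driver name, database and credentials below are assumptions; adjust them for your install:

using System;
using System.Data.Odbc; // ships with .NET 1.1

class OdbcDemo
{
    public static void Main()
    {
        // Assumed connection string: MyODBC 3.51 driver, local "test" database.
        string connStr = "DRIVER={MySQL ODBC 3.51 Driver};" +
            "SERVER=localhost;DATABASE=test;UID=root;PWD=;";

        using (OdbcConnection conn = new OdbcConnection(connStr))
        {
            conn.Open();
            OdbcCommand cmd = new OdbcCommand("SELECT VERSION()", conn);
            Console.WriteLine("MySQL version: {0}", cmd.ExecuteScalar());
        }
    }
}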
Found some documentation from the old VRANNIS project we worked on back in 1998. VRANNIS is a voice recognition system that identifies speakers from a population of prototyped users.
Voice Recognition Artificial Neural Network Identification System (VRANNIS)
Prepared for:
Dr. Michael Stiber
CSSIE-490, Neural Networks
U of Washington, Bothell
Submitted by:
Botello, Drake R.
dbotello@u.washington.edu
Nguyen, Hoai P.
nguyener@u.washington.edu
Khoat, Do V.
Kdo@u.washington.edu
Graupmann, Timothy A.
tgraupma@u.washington.edu
August 20, 1998
The Voice Recognition Artificial Neural Network Identification System (VRANNIS) presented herein is a “speaker identification” Artificial Neural Network (ANN). As such, VRANNIS® should not be confused with “word recognition” neural networks, which are inherently far more complex and capable of commercial applications.
VRANNIS outputs the personal identity of a speaker’s voice provided that the speaker’s input vector is a member of the prototype vector population. Classification is achieved by comparing the “power spectrum” signature points belonging to the input vector against the “power spectrum” signature points belonging to the prototype vector population. In this regard, VRANNIS functions as an artificially intelligent human ear, capable of accurately identifying voices with which it is associatively familiar. Additional time and work would extend the functionality of VRANNIS to encompass reliable identification of non-prototype users.
VRANNIS utilizes a Pentium® IBM®-compatible PC, an external microphone and Visual C++®/MATLAB® programming tools. The neural network consists of a single perceptron. VRANNIS can classify its four prototype vectors with the following accuracy: PV1 = 100%, PV2 = 76%, PV3 = 83% and PV4 = 100%.
Application Area
Automatic Speech Recognition (ASR) can be broken into three categories: speaker-dependent, speaker-independent and speaker-adaptive. VRANNIS is emblematic of a speaker-dependent ASR, as it is trained to recognize one of four specific speakers. As such, VRANNIS has limited commercial appeal beyond that of a toy.
However, achieving consistent and accurate results within the scope of VRANNIS’s application required a relatively high-level understanding of human speech characteristics, signal processing and analysis techniques, principal component analysis, Visual C++® and MATLAB® programming ability, and neural network architecture knowledge, inter alia.
The goal, or problem space, of VRANNIS entailed developing a neural network to mimic the functionality of the human ear and approximate its performance in reliably identifying the owner of a voice. A human ear can easily identify a familiar or unfamiliar voice by the process of auditory association, which in essence is a biological ASR neural network. The human ear is also capable of reliably identifying a given speaker’s voice in “delta” situations, including changed rates of speech, manner of speech and background noise. VRANNIS’s accuracy suffers under similar situations.
User Examples, Exercises and Functionality
VRANNIS program files, located on the accompanying diskette, must first be unzipped and loaded into a “temporary VRANNIS program directory” in MATLAB® before proceeding with any of the user examples or exercises. The authors highly recommend scanning the diskette with an anti-virus utility capable of scanning “unopened” .zip files.
A user can demonstrate VRANNIS’s functionality and efficiency as follows:
1. Run the presentation:
a. EDU>> present
b. Strike any key to scroll forward through the entire presentation
2. Load voice samples:
a. variable=load_sample('drake1.wav');
b. Substitute 'drake(1:29).wav' to load additional 'drake' prototype vectors
c. Substitute '(hoai, khoat, or tim)1.wav' to load additional member prototype vectors
VRANNIS Signal Processing and Design Criteria
Initial Signal Processing
A voiceprint was digitally recorded twenty-nine individual times for each of the four individuals belonging to the prototype population (P1(1:29) … P4(1:29)). Subsequently, each voiceprint was converted from amplitude over time to power (p) over frequency by application of the Fourier Transform.
Next, the power spectrum (total number of p-values) belonging to P1(1:29) … P4(1:29) was reduced from 4,000 p-values to 500 p-values by application of the fft function (See Figure 1, page 9).
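Stated in standard notation for clarity (these are the textbook definitions, not the report’s original equations): for a sampled signal x(n) of length N, the discrete Fourier transform and the resulting power spectrum are

X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-2\pi i k n / N}, \qquad p(k) = |X(k)|^2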
Signal processing achieved three (3) important VRANNIS milestones as follows:
Principal Signal Component Processing
Principal voiceprints P1(1:29) … P4(1:29) varied dramatically from one another, as one would expect. Unexpectedly, the individual voiceprints belonging to P1(1:29), P2(1:29), P3(1:29) and P4(1:29) also varied substantially amongst themselves. Thus the p-values in each of the 1:29 voiceprints belonging to P1 … P4 required “normalization” before being constructed into reliable prototype vectors (PV1 … PV4) representative of each of the 1:29 voiceprints belonging to P1 … P4.
Normalization was achieved by dividing each p-spectrum spanning 500 p-values for PV1 … PV4 into fifty discrete sampling windows. Each sampling window contains ten individually indexed p-values. A hypothetical sampling window is shown in Figure 2 below, containing ten indexed p-values:
An algorithm searches each sampling window (50 × 29) belonging to each voiceprint for P1, P2, P3 and P4 and returns three different values for each sampling window: the mean of the ten p-values, the mode of the maximum p-value index, and the mean of (the mode + the maximum indexed p-value).
Referring to Figure 3, the mean represents the values of all ten p-values, shown as vertical lines. The vertical line intersecting the bold horizontal line represents the mode of the maximum indexed p-value. Once the mode of the max indexed p-value is identified, the mean of the max indexed p-value is computed to a single value, and the mean of the mode and that mean is computed as the final value for a single sampling window. The equation is written as:
mean{ max(p-value index), mode(max(p-value index)) }    (4.0)
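A minimal sketch of that per-window computation, written in C# for clarity (the names, and the assumption that the mode is computed elsewhere across the 29 recordings, are ours; the original implementation was MATLAB):

// window: the ten p-values of one sampling window.
// maxIndexMode: the mode, across the 29 recordings, of the index at
// which this window's power peaks (assumed computed elsewhere).
static double WindowSignature(double[] window, int maxIndexMode)
{
    // Locate the index of the maximum p-value in this window.
    int maxIndex = 0;
    for (int i = 1; i < window.Length; i++)
        if (window[i] > window[maxIndex]) maxIndex = i;

    // Equation (4.0): mean{ max(p-value index), mode(max(p-value index)) }
    return (maxIndex + maxIndexMode) / 2.0;
}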
Principal signal component processing achieved two (2) additional design milestones as follows:
Interim Vector Component Processing
As an input vector passes through each of the signal processing stages discussed above, its p-indexed waveform navigates through each of the indexed p-values belonging to PV1, PV2, PV3 and PV4. This dynamic relationship is illustrated in Figure 4, page 10.
Next, an algorithm measures the distance from each indexed p-value belonging to the input vector to each of the indexed p-values belonging to PV1, PV2, PV3 and PV4. The algorithm returns the prototype vector (PV1, PV2, PV3 or PV4) that is closest in distance to the input vector at each of the fifty p-indexed sampling windows (note that the absolute value compensates for negative distances). In this regard, the vector component processing methodology borrows (partially) from the “distance logic” found in Hamming neural networks.
The algorithm counts the number of occurrences (or hits) where the input vector coincides with PV1, PV2, PV3 or PV4. A hit is regarded as a 1, whereas a miss is regarded as a -1, which gives rise to four vectors each containing fifty elements. This information is input into a single-layer perceptron for final classification.
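A rough C# sketch of the distance-and-hit logic just described (our reconstruction under stated assumptions, not the original MATLAB source):

// input: the fifty window signatures of the input vector.
// prototypes: four arrays of fifty window signatures (PV1..PV4).
// Returns four fifty-element vectors of +1 (hit) / -1 (miss).
static int[][] HitVectors(double[] input, double[][] prototypes)
{
    int[][] hits = new int[prototypes.Length][];
    for (int p = 0; p < prototypes.Length; p++)
        hits[p] = new int[input.Length];

    for (int w = 0; w < input.Length; w++)
    {
        // The absolute value compensates for negative distances.
        int closest = 0;
        for (int p = 1; p < prototypes.Length; p++)
            if (Math.Abs(prototypes[p][w] - input[w]) <
                Math.Abs(prototypes[closest][w] - input[w]))
                closest = p;

        for (int p = 0; p < prototypes.Length; p++)
            hits[p][w] = (p == closest) ? 1 : -1;
    }
    return hits; // fed to the single-layer perceptron for classification
}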
VRANNIS Test Results
NAME:     Drake 1:29 | Hoai 1:29 | Khoat 1:29 | Tim 1:29
ACCURACY: 100%       | 76%       | 83%        | 100%
Alternative Approaches
VRANNIS could be implemented using a Hamming neural network, which would conceivably make VRANNIS substantially more powerful, particularly for identifying both prototype and non-prototype users. The addition of a low-pass filter is viewed as secondary to furthering the accuracy of a Hamming neural network. That is, we would first like to develop a Hamming network and test the results before implementing filtering techniques.
Limitations
VRANNIS suffers from the inability to identify non-prototype users, as noted above. In addition, VRANNIS’s classification reliability suffers variably with the conditions of the sampling session. We have not isolated the cause and effect of these variable circumstances.
If afforded additional time, we believe that energies devoted to replacing VRANNIS’s architecture with a Hamming neural network would most likely eliminate or mitigate the apparent “sensitivities” associated with the current version of VRANNIS.
Problems Encountered
As laypersons new to neural networks and voice signal processing, we encountered several challenges both individually and collectively. However, we regard these instances as applied learning opportunities, which is why we elected to develop a project rather than write a research paper. The new skills and awareness we have “earned” encompass abilities that we did not fully possess before embarking.
Setting philosophical meandering aside, we faced the following problems:
Thus we employed matrix and bias values that returned any of the four prototype vectors. However, we had to develop a technique to calculate the distances between the input vector and the four prototype vectors in such a way that -1 or 1 would be returned. The “distance technique” we employed approximates the logic of Hamming, which is one of the reasons we believe a Hamming network is ideal for VRANNIS.
Curiously, VRANNIS had little difficulty identifying the voice of any one of us!
With respect to the “problems” listed herein, a few of us would like to meet with you late next week (or the next) to discuss any insights you may wish to share.
Future Work
While we seemingly have placed the Hamming Net on a pedestal, there are several avenues within the context of VRANNIS, as it presently exists, that seem worthy of investigation. For example:
In essence, we would first attempt to discover if the accuracy of VRANNIS could be improved for externally voiced utterances – by simply increasing the number of recorded voice wave samples. Secondarily, we would increase the p-spectrum field. Finally, we would greatly increase the number of indexed p-value sampling windows.
All test results would be recorded, analyzed and documented. In this regard, “future work” constitutes a “research project” that Tim Graupmann, Drake Botello and possibly others would be willing to undertake in the future, either personally or under a directed study.
Having completed all of the proposed research, we would next focus our efforts on developing a Hamming Neural Network for VRANNIS. Test results would be compared to each of the other successive methodologies.
Appendix

VRANNIS test results
The next 72 Hour Game Development Competition is coming June 25, 2004. Keep checking the [GDC Board] for updates. Get prepared!
[RULES]
Theme suggestions can be given immediately. On June 11, when I announce it, stage 1 voting will begin. On June 18, stage 2 voting will begin. On June 25, at exactly 12:01 PM EST (that’s right after noon), the results of the voting will be announced and the competition will begin. On June 28, at exactly 12:01 PM EST, the competition will be over and no more entries will be accepted.
This excellent example illustrates how [AVL Trees] (self-balancing binary search trees) function.
More Info:
[AVL Trees] — step by step information with source code.
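For a taste of how the balancing works, here is a minimal C# sketch of AVL insertion with the four rebalancing rotations (illustrative only; see the linked article for the step-by-step treatment):

using System;

class AvlNode
{
    public int Key;
    public int Height = 1;
    public AvlNode Left, Right;
    public AvlNode(int key) { Key = key; }
}

class AvlTree
{
    static int H(AvlNode n) { return n == null ? 0 : n.Height; }
    static int Balance(AvlNode n) { return H(n.Left) - H(n.Right); }
    static void FixHeight(AvlNode n) { n.Height = 1 + Math.Max(H(n.Left), H(n.Right)); }

    static AvlNode RotateRight(AvlNode y)
    {
        AvlNode x = y.Left;
        y.Left = x.Right;
        x.Right = y;
        FixHeight(y); FixHeight(x);
        return x;
    }

    static AvlNode RotateLeft(AvlNode x)
    {
        AvlNode y = x.Right;
        x.Right = y.Left;
        y.Left = x;
        FixHeight(x); FixHeight(y);
        return y;
    }

    // Inserts key, then rebalances on the way back up the recursion,
    // keeping the tree's height O(log n).
    public static AvlNode Insert(AvlNode node, int key)
    {
        if (node == null) return new AvlNode(key);
        if (key < node.Key) node.Left = Insert(node.Left, key);
        else if (key > node.Key) node.Right = Insert(node.Right, key);
        else return node; // ignore duplicates

        FixHeight(node);
        int balance = Balance(node);

        if (balance > 1 && key < node.Left.Key) return RotateRight(node);   // left-left
        if (balance < -1 && key > node.Right.Key) return RotateLeft(node);  // right-right
        if (balance > 1) { node.Left = RotateLeft(node.Left); return RotateRight(node); }    // left-right
        if (balance < -1) { node.Right = RotateRight(node.Right); return RotateLeft(node); } // right-left
        return node;
    }
}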
Judges have been selected and the topic has been chosen. This time contestants will be making sword-based combat games! Some part of the game must include using a sword to engage in combat. While it may sound a little limiting in the creativity department, we encourage the contestants to push it as far as it will go. We’re all looking forward to a very fun competition. Good luck!
Visit [GameDev] for more information.
Here is more great Art from Erik:
Erik E: Here’s a sketch I did for the frog warriors you fight when shrunk down by a magic spell (or was it the red pill?). Not sure about the armor. Looks like he got into a fight with Zamfir.
(C) Erik Elverskog
Alright! [The Halo Kitty II] – playable demo is out! Check it out!
Conversation from an AIM message between me and Flax.
tgraupmann648: I had a wicked idea before I fell into deep sleep last night
Flax0000: Oh yer?
Flax0000: What was it?
tgraupmann648: I was thinking about connecting the model editor to a mysql database
tgraupmann648: I have this concept of a virtual workspace
tgraupmann648: So a person has complete control over their workspace
tgraupmann648: But you can flip through other users using the tool to watch what they are making
tgraupmann648: Similar to Linux how you can flip the TTY or workspaces
tgraupmann648: Eventually it could evolve into collaborative modeling
Flax0000: cool
tgraupmann648: On the technical side, I would just need to convert from C++ to C# and add a web reference
tgraupmann648: The web reference could connect to my site using web services that talk to a mysql database
tgraupmann648: I’ll save the idea, short term I’ve added the ability to zoom by scrolling with the mouse
TagML is a new 3D model editor / animator. The [source] and [build] are publicly available. The latest feature adds the ability to lock an arbitrary axis. As requested, this will give you better precision while animating models. You can also lock by the strafe vector (side to side) and up vector (up and down) relative to the viewports, or you can lock by axis. Oh, and you can zoom by scrolling the mouse.
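One common way to implement this kind of axis locking is to project the raw drag delta onto the locked direction. A minimal C# sketch of the idea (the names are ours, not TagML’s actual source):

struct Vec3
{
    public float X, Y, Z;
    public Vec3(float x, float y, float z) { X = x; Y = y; Z = z; }

    public static float Dot(Vec3 a, Vec3 b) { return a.X * b.X + a.Y * b.Y + a.Z * b.Z; }
    public static Vec3 Scale(Vec3 v, float s) { return new Vec3(v.X * s, v.Y * s, v.Z * s); }
}

class AxisLock
{
    // axis must be normalized: the viewport's strafe vector, its up
    // vector, or a world axis such as (1, 0, 0).
    public static Vec3 Constrain(Vec3 delta, Vec3 axis)
    {
        // Keep only the component of the drag that lies along the axis.
        return Vec3.Scale(axis, Vec3.Dot(delta, axis));
    }
}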
I purchased a license for ZBrush 2.0. It’s just like they say, digital putty. I’m going through every tutorial and that doesn’t even scratch the surface of what this tool can do. Here is my first attempt at digital sculpting using the Sphere3D method. Like so many other posts at ZBrush Central… “My First Head”…
TagML is a new 3D model editor / animator. I have made the [source] and [build] publicly available. The new feature I just added is a better keyframe display bar. Hopefully it looks like a hybrid of the Photoshop and Flash styles. If it’s not intuitive, let me know. As always, suggestions are welcome.
Bones (C) The Game Creators Ltd.
I recently created a panel base object which the toolbars inherit to allow the use of textures in the toolbar buttons themselves. I moved the keyframe track display into its own panel. I still have the intent that the keyframe display will also be textured. I just finished the code that will allow the animation bar buttons to use textures. And I spent a little time drawing the icons to match the shaded background.
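A rough sketch of that inheritance arrangement (class and member names are guesses, not the actual TagML source):

class TexturedPanel
{
    // Handle of the texture used for the panel's shaded background.
    protected int backgroundTexture;

    public virtual void Draw()
    {
        // Draw the panel quad with backgroundTexture bound.
    }
}

class Toolbar : TexturedPanel
{
    public override void Draw()
    {
        base.Draw(); // shaded background first
        // Then draw each button quad with its own icon texture.
    }
}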
Android (C) The Game Creators Ltd.
Here is a hint at some of the upcoming art that you’ll find at FellStorm Software.
[Full Size]
(C) Erik Elverskog
The demos are getting cooler. For a group project in Linae’s screen/game writing class, she created a [fan page / mock mission] for the online MMORPG PristonTale.
(C) Priston & Linae M. Graupmann
A few more conceptual sketches came from re-reading the enemy spec. Here are some attempts at the {wolf-style, raven, hybrid spider-frog-piranha monster}.
(C) FellStorm Software
Linae’s Flash class had a group assignment (teams of 3) tasked with designing a mock demo of a game using the skills picked up in the first few weeks of her Interactive Authoring course. [Halo Kitty] is quite entertaining.
(C) Linae M. Graupmann
When you start ASPX and ASMX programming, the default IIS settings are commonly misconfigured, and the solution is not well documented. The configuration process requires this command in order to work properly:

aspnet_regiis.exe -i

aspnet_regiis can normally be found in:

C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322

These are common commands for controlling IIS:

iisreset /start
iisreset /status
iisreset /stop
Linae completed an [edutainment flash demo] as her individual assignment for her Macromedia Flash class. I’m sure everybody can enjoy it.
(C) Linae M. Graupmann
You fans out there have been asking what I’ve been doing in the couple of months since my last post. I’ve been quite busy. During the daylight hours, I’ve been testing the next-generation bidding system at INSP. And in my night hours, I’ve joined FellStorm Software, a game development group working on an action RPG. As you can see from the pic to the left, I’ve designed a model editor from scratch, capable of importing 3ds files or creating animated models. Basically it’s an animation tool which saves models in the new TagML format. The tool is unique in that it will take models straight from 3D Studio. The internal structure is all vector based, so it’s portable to OpenGL or DirectX.
Alien Hivebrain (C) The Game Creators Ltd.