Use Python To Create A '.Gif' From Any Video

Animated images are really popular on the web right now, especially on sites like Tumblr and Reddit. Every now and then you may see something in a TV show or movie you are watching and want to create a gif from it. Using Python and 'images2gif' this is very easy, and this is how.

Requirements:

Windows isn't required, but I am going to assume you are running on it.

The Plan

The plan is very simple. We are going to use FFMpeg to generate a series of screenshots from the video, and then we are going to resize them and stitch them together using "images2gif".

Set up the batch file to grab screenshots

FFMpeg is what we will be using to grab the screenshots, and it can be called with a few command line arguments to get what we want out of it.

ffmpeg -i "Video.wmv" -ss 200 -f image2 -vframes 100 "Images//Frame%%03d.png"
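  • -i : The video file we want to use.
  • -ss : The start time, in seconds, we want to begin at.
  • -f : The output format (image2 writes a sequence of image files).
  • -vframes : The number of frames we want to grab.
  • The last argument is the output directory and file name. The %%03d will be replaced by the zero-padded frame number so we have a sequential list. The percent sign is doubled because this line lives in a batch file; typed directly at the command prompt it would be a single %03d.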

Have a little play with this to get a feel for how it works and then put it into a batch file.

Calling the batch file from Python

Starting with a blank python file you will want to add code like this:

import os

#Create the screengrabs
os.system("GrabScreenshots.bat ")

That will write out the screenshots into the folder you set in the batch file.
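
If you would rather skip the batch file, the same FFMpeg call can probably be made straight from Python. This is only a minimal sketch, assuming ffmpeg is on your PATH; the function names build_ffmpeg_command and grab_screenshots are made up for illustration. Note that the frame pattern uses a single %03d here because nothing passes through a batch file.

```python
import subprocess


def build_ffmpeg_command(video, start_seconds, frame_count, out_dir="Images"):
    """Build the same argument list the batch file passes to ffmpeg."""
    # A single % is used because this never passes through a batch file.
    pattern = out_dir + "/Frame%03d.png"
    return [
        "ffmpeg",
        "-i", video,                    # input video
        "-ss", str(start_seconds),      # start time in seconds
        "-f", "image2",                 # write an image sequence
        "-vframes", str(frame_count),   # number of frames to grab
        pattern,
    ]


def grab_screenshots(video, start_seconds, frame_count, out_dir="Images"):
    """Run ffmpeg and return its exit code (0 means success)."""
    command = build_ffmpeg_command(video, start_seconds, frame_count, out_dir)
    return subprocess.call(command)


# Show the command that would be run for the example above.
print(" ".join(build_ffmpeg_command("Video.wmv", 200, 100)))
```

Passing the arguments as a list means file names with spaces do not need any extra quoting.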

 Using the images in python

Once the images are created you will want to load them like this:

from PIL import Image

#Get the file names, sorted so the frames stay in order
file_names = sorted(fn for fn in os.listdir('./Images') if fn.endswith('.png'))
print(file_names)

#Open the files
print('Opening the images')
images = [Image.open('Images/' + fn) for fn in file_names]

Resizing the images

This can be done using standard PIL functions like so:

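#Resize so the output is 192 pixels high, keeping the aspect ratio
baseSize = images[0].size
minheight = 192
height = baseSize[1]
scale = float(height)/float(minheight)
print("Scale: " + str(scale))

print("Old Size: " + str(baseSize))
baseSize = (int(baseSize[0] / scale), int(baseSize[1] / scale))

print('Resizing')
size = baseSize
for im in images:
    #thumbnail() resizes the image in place
    #(Pillow 10 and later removed Image.ANTIALIAS; use Image.LANCZOS there)
    im.thumbnail(size, Image.ANTIALIAS)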
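
To get a feel for the numbers before running it on real frames, the scale calculation above can be pulled out into a small function. This is just a sketch; the name scaled_size is made up for illustration and it uses the same arithmetic as the snippet above.

```python
def scaled_size(original_size, target_height=192):
    """Return (width, height) scaled so the height matches target_height."""
    width, height = original_size
    scale = float(height) / float(target_height)
    return (int(width / scale), int(height / scale))


# A 1280x720 frame scaled down to 192 pixels high.
print(scaled_size((1280, 720)))  # (341, 192)
```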

Writing out the .gif file

This is where we use the "images2gif" to create a gif from our resized images.

from images2gif import writeGif
from time import time

#Set to 24FPS (1/24 of a second per frame)
runningtime = 0.0416
print(runningtime)

print('Saving')
#The "Gif" folder must already exist
filename = "Gif/Gif"
writeGif(filename + str(int(time())) + ".gif", images, duration=runningtime, dither=1, nq=1)

I have set 0.0416 as the frame time as that is the time of one frame at 24fps. A lot of this could be controlled on input, or through a simple interface but that is a task for the reader.
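
One thing worth knowing when picking a frame time: the GIF format stores each frame's delay as a whole number of hundredths of a second, so 0.0416 cannot be stored exactly. The sketch below works through the arithmetic. The function names are made up for illustration, and exactly how images2gif rounds the value is an assumption on my part (for 0.0416 rounding and truncating both give 4).

```python
def frame_duration(fps):
    """Seconds each frame is shown for at the given frame rate."""
    return 1.0 / fps


def gif_delay_centiseconds(duration_seconds):
    """Nearest delay a GIF can store, in hundredths of a second."""
    return int(round(duration_seconds * 100))


delay = gif_delay_centiseconds(0.0416)
print(frame_duration(24))   # roughly 0.0417
print(delay)                # 4
print(100.0 / delay)        # 25.0 frames per second after rounding
```

So a 24fps setting will most likely play back at about 25fps, which is close enough for most clips.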

This last bit of code will write out the .gif file for you to use. I would recommend playing with the settings and the size of the output image to find the desired quality and file size.

FINAL CODE

Here is all that code together with some input controls to make it easier to hook up and some cleanup at the end.

Python:

from images2gif import writeGif
from PIL import Image
from time import time
import os
import sys

#Part 1 -------------------------------

#Create the screengrabs
BatFileString = "GS.bat"

#Set Filename
BatFileString += " AD5.wmv"

#Set Start time (in seconds)
BatFileString += " 75"

#Set Desired Number of Frames
BatFileString += " 10"

print(BatFileString)
os.system(BatFileString)

# Part 2 -------------------------------

#Get the file names, sorted so the frames stay in order
file_names = sorted(fn for fn in os.listdir('./Images') if fn.endswith('.png'))
print(file_names)

#Open the files
print('Opening the images')
images = [Image.open('Images/' + fn) for fn in file_names]

#Resize so the output is 192 pixels high, keeping the aspect ratio
baseSize = images[0].size
minheight = 192
height = baseSize[1]
scale = float(height)/float(minheight)
print("Scale: " + str(scale))

print("Old Size: " + str(baseSize))
baseSize = (int(baseSize[0] / scale), int(baseSize[1] / scale))

print('Resizing')
size = baseSize
for im in images:
    #thumbnail() resizes the image in place
    im.thumbnail(size, Image.ANTIALIAS)

#Part 3-------------------------------

#Set to 24FPS (1/24 of a second per frame)
runningtime = 0.0416
print(runningtime)

print('Saving')
#The "Gif" folder must already exist
filename = "Gif/Gif"
writeGif(filename + str(int(time())) + ".gif", images, duration=runningtime, dither=1, nq=1)

#Clean Up -----------------------------

#Remove generated images
filelist = [f for f in os.listdir('./Images') if f.endswith(".png")]
for f in filelist:
    os.remove('./Images/' + f)

Bat File:

ffmpeg -i %1 -ss %2 -f image2 -vframes %3 "Images//outd%%03d.png"

 

And here is a link of it all together: DOWNLOAD.

Next time I look at Python I will show you how this can be combined with other tools to easily make a .gif from YouTube and other web video sites and automatically upload it to Twitter or Tumblr.

 

 

So the tool is coming along...

So the shader analysing tool is coming along. It is still a bit messy but the back-end is all in place. 

As soon as the next game is released it will be great to get some time to make the tool more presentable, useful and get some good PBR in there as default.


AMD IL Empty Instruction Counts

Just a note on the content of this article:
I am still learning about AMD IL so this could be caused by my own mistakes when getting the IL code!

So, recently I have been looking at the AMD ISA and AMD IL code generated from HLSL shaders. During my last poke around in the generated AMD IL code I noticed that the code is generated in 64 instruction chunks. If your shader converts to 63 instructions then the AMD IL code will be 64 instructions long, with the last instruction being " v_cndmask_b32  v0, s0, v0, vcc ". But if you go over this and have a 65 instruction shader then the generated code will be 128 instructions long, effectively doubling the number of instructions to gain one more.

Here is an example:

HLSL
float4 psMain(PS_INPUT input) : SV_TARGET
{
	float IC = 1.0f;
	IC += input.pos.x;
	IC += input.pos.y;
	IC += input.pos.z;
	IC += input.tex.x;
	IC += input.tex.y;
	IC += input.tex.z;

	//Code that will tip us over the 64 instruction block.
	//IC += input.tex.w;

	return float4(IC, IC, IC, IC);
}
AMD IL
shader psMain

  v_cndmask_b32  v0, s9, v0, vcc                            // 00000000: 00000009
  v_cndmask_b32  v0, s0, v129, vcc                          // 00000004: 00010200
  v_cndmask_b32  v0, v93, v128, vcc                         // 00000008: 0001015D
  v_cndmask_b32  v64, exec_lo, v0, vcc                      // 0000000C: 0080007E
  v_cndmask_b32  v48, s0, v128, vcc                         // 00000010: 00610000
  v_cndmask_b32  v0, s21, v0, vcc                           // 00000014: 00000015
  v_cndmask_b32  v35, exec_lo, v0, vcc                      // 00000018: 0046007E
  v_cndmask_b32  v48, s1, v128, vcc                         // 0000001C: 00610001
  v_cndmask_b32  v0, s21, v0, vcc                           // 00000020: 00000015
  v_cndmask_b32  v3, s125, v0, vcc                          // 00000024: 0006007D
  v_cndmask_b32  v17, s0, v0, vcc                           // 00000028: 00220000
  v_cndmask_b32  v0, s3, v0, vcc                            // 0000002C: 00000003
  v_cndmask_b32  v34, s0, v0, vcc                           // 00000030: 00440000
  v_cndmask_b32  v0, s1, v0, vcc                            // 00000034: 00000001
  v_cndmask_b32  v48, s0, v128, vcc                         // 00000038: 00610000
  v_cndmask_b32  v0, s0, v0, vcc                            // 0000003C: 00000000
  v_cndmask_b32  v48, s0, v128, vcc                         // 00000040: 00610000
  v_cndmask_b32  v0, v17, v8, vcc                           // 00000044: 00001111
  v_cndmask_b32  v0, s3, v0, vcc                            // 00000048: 00000003
  v_cndmask_b32  v34, s0, v0, vcc                           // 0000004C: 00440000
  v_cndmask_b32  v0, s1, v0, vcc                            // 00000050: 00000001
  v_cndmask_b32  v34, s0, v0, vcc                           // 00000054: 00440000
  v_cndmask_b32  v0, s0, v0, vcc                            // 00000058: 00000000
  v_cndmask_b32  v48, s0, v128, vcc                         // 0000005C: 00610000
  v_cndmask_b32  v0, s34, v17, vcc                          // 00000060: 00002222
  v_cndmask_b32  v0, s3, v0, vcc                            // 00000064: 00000003
  v_cndmask_b32  v34, s0, v0, vcc                           // 00000068: 00440000
  v_cndmask_b32  v0, s1, v0, vcc                            // 0000006C: 00000001
  v_cndmask_b32  v34, s0, v0, vcc                           // 00000070: 00440000
  v_cndmask_b32  v0, s0, v0, vcc                            // 00000074: 00000000
  v_cndmask_b32  v48, s1, v128, vcc                         // 00000078: 00610001
  v_cndmask_b32  v0, s0, v0, vcc                            // 0000007C: 00000000
  v_cndmask_b32  v0, s3, v0, vcc                            // 00000080: 00000003
  v_cndmask_b32  v34, s0, v0, vcc                           // 00000084: 00440000
  v_cndmask_b32  v0, s1, v0, vcc                            // 00000088: 00000001
  v_cndmask_b32  v34, s0, v0, vcc                           // 0000008C: 00440000
  v_cndmask_b32  v0, s0, v0, vcc                            // 00000090: 00000000
  v_cndmask_b32  v48, s1, v128, vcc                         // 00000094: 00610001
  v_cndmask_b32  v0, v17, v8, vcc                           // 00000098: 00001111
  v_cndmask_b32  v0, s3, v0, vcc                            // 0000009C: 00000003
  v_cndmask_b32  v34, s0, v0, vcc                           // 000000A0: 00440000
  v_cndmask_b32  v0, s1, v0, vcc                            // 000000A4: 00000001
  v_cndmask_b32  v34, s0, v0, vcc                           // 000000A8: 00440000
  v_cndmask_b32  v0, s0, v0, vcc                            // 000000AC: 00000000
  v_cndmask_b32  v48, s1, v128, vcc                         // 000000B0: 00610001
  v_cndmask_b32  v0, s34, v17, vcc                          // 000000B4: 00002222
  v_cndmask_b32  v0, ttmp9, v0, vcc                         // 000000B8: 00000079
  v_cndmask_b32  v16, s0, v0, vcc                           // 000000BC: 00200000
  v_mac_f32     v192, s0, v0                                // 000000C0: 3F800000
  v_mac_f32     v192, s0, v0                                // 000000C4: 3F800000
  v_mac_f32     v192, s0, v0                                // 000000C8: 3F800000
  v_mac_f32     v192, s0, v0                                // 000000CC: 3F800000
  v_cndmask_b32  v0, s3, v0, vcc                            // 000000D0: 00000003
  v_cndmask_b32  v2, s0, v8, vcc                            // 000000D4: 00041000
  v_cndmask_b32  v34, s0, v0, vcc                           // 000000D8: 00440000
  v_cndmask_b32  v0, s0, v0, vcc                            // 000000DC: 00000000
  v_cndmask_b32  v16, s0, v0, vcc                           // 000000E0: 00200000
  v_cndmask_b32  v0, s71, v0, vcc                           // 000000E4: 00000047
  v_cndmask_b32  v49, s0, v0, vcc                           // 000000E8: 00620000
  v_cndmask_b32  v0, s85, v0, vcc                           // 000000EC: 00000055
  v_cndmask_b32  v34, s0, v8, vcc                           // 000000F0: 00441000
  v_cndmask_b32  v0, s16, v25, vcc                          // 000000F4: 00003210
  v_cndmask_b32  v0, ttmp3, v0, vcc                         // 000000F8: 00000073
  v_cndmask_b32  v0, s40, v0, vcc                           // 000000FC: 00000028
end

Above we can see the generated code which fits exactly within the 64 instruction limit. And then if we uncomment the last addition this happens:

HLSL
float4 psMain(PS_INPUT input) : SV_TARGET
{
	float IC = 1.0f;
	IC += input.pos.x;
	IC += input.pos.y;
	IC += input.pos.z;
	IC += input.tex.x;
	IC += input.tex.y;
	IC += input.tex.z;

	//Code that will tip us over the 64 instruction block.
	IC += input.tex.w;

	return float4(IC, IC, IC, IC);
}
AMD IL
shader psMain

  v_cndmask_b32  v0, s9, v0, vcc                            // 00000000: 00000009
  v_cndmask_b32  v0, s0, v129, vcc                          // 00000004: 00010200
  v_cndmask_b32  v0, v93, v128, vcc                         // 00000008: 0001015D
  v_cndmask_b32  v64, exec_lo, v0, vcc                      // 0000000C: 0080007E
  v_cndmask_b32  v48, s0, v128, vcc                         // 00000010: 00610000
  v_cndmask_b32  v0, s21, v0, vcc                           // 00000014: 00000015
  v_cndmask_b32  v35, exec_lo, v0, vcc                      // 00000018: 0046007E
  v_cndmask_b32  v16, s1, v128, vcc                         // 0000001C: 00210001
  v_cndmask_b32  v3, s125, v0, vcc                          // 00000020: 0006007D
  v_cndmask_b32  v17, s0, v0, vcc                           // 00000024: 00220000
  v_cndmask_b32  v0, s3, v0, vcc                            // 00000028: 00000003
  v_cndmask_b32  v34, s0, v0, vcc                           // 0000002C: 00440000
  v_cndmask_b32  v0, s1, v0, vcc                            // 00000030: 00000001
  v_cndmask_b32  v48, s0, v128, vcc                         // 00000034: 00610000
  v_cndmask_b32  v0, s0, v0, vcc                            // 00000038: 00000000
  v_cndmask_b32  v48, s0, v128, vcc                         // 0000003C: 00610000
  v_cndmask_b32  v0, v17, v8, vcc                           // 00000040: 00001111
  v_cndmask_b32  v0, s3, v0, vcc                            // 00000044: 00000003
  v_cndmask_b32  v34, s0, v0, vcc                           // 00000048: 00440000
  v_cndmask_b32  v0, s1, v0, vcc                            // 0000004C: 00000001
  v_cndmask_b32  v34, s0, v0, vcc                           // 00000050: 00440000
  v_cndmask_b32  v0, s0, v0, vcc                            // 00000054: 00000000
  v_cndmask_b32  v48, s0, v128, vcc                         // 00000058: 00610000
  v_cndmask_b32  v0, s34, v17, vcc                          // 0000005C: 00002222
  v_cndmask_b32  v0, s3, v0, vcc                            // 00000060: 00000003
  v_cndmask_b32  v34, s0, v0, vcc                           // 00000064: 00440000
  v_cndmask_b32  v0, s1, v0, vcc                            // 00000068: 00000001
  v_cndmask_b32  v34, s0, v0, vcc                           // 0000006C: 00440000
  v_cndmask_b32  v0, s0, v0, vcc                            // 00000070: 00000000
  v_cndmask_b32  v48, s1, v128, vcc                         // 00000074: 00610001
  v_cndmask_b32  v0, s0, v0, vcc                            // 00000078: 00000000
  v_cndmask_b32  v0, s3, v0, vcc                            // 0000007C: 00000003
  v_cndmask_b32  v34, s0, v0, vcc                           // 00000080: 00440000
  v_cndmask_b32  v0, s1, v0, vcc                            // 00000084: 00000001
  v_cndmask_b32  v34, s0, v0, vcc                           // 00000088: 00440000
  v_cndmask_b32  v0, s0, v0, vcc                            // 0000008C: 00000000
  v_cndmask_b32  v48, s1, v128, vcc                         // 00000090: 00610001
  v_cndmask_b32  v0, v17, v8, vcc                           // 00000094: 00001111
  v_cndmask_b32  v0, s3, v0, vcc                            // 00000098: 00000003
  v_cndmask_b32  v34, s0, v0, vcc                           // 0000009C: 00440000
  v_cndmask_b32  v0, s1, v0, vcc                            // 000000A0: 00000001
  v_cndmask_b32  v34, s0, v0, vcc                           // 000000A4: 00440000
  v_cndmask_b32  v0, s0, v0, vcc                            // 000000A8: 00000000
  v_cndmask_b32  v48, s1, v128, vcc                         // 000000AC: 00610001
  v_cndmask_b32  v0, s34, v17, vcc                          // 000000B0: 00002222
  v_cndmask_b32  v0, s3, v0, vcc                            // 000000B4: 00000003
  v_cndmask_b32  v34, s0, v0, vcc                           // 000000B8: 00440000
  v_cndmask_b32  v0, s1, v0, vcc                            // 000000BC: 00000001
  v_cndmask_b32  v34, s0, v0, vcc                           // 000000C0: 00440000
  v_cndmask_b32  v0, s0, v0, vcc                            // 000000C4: 00000000
  v_cndmask_b32  v48, s1, v128, vcc                         // 000000C8: 00610001
  v_cndmask_b32  v0, v51, v25, vcc                          // 000000CC: 00003333
  v_cndmask_b32  v0, ttmp9, v0, vcc                         // 000000D0: 00000079
  v_cndmask_b32  v16, s0, v0, vcc                           // 000000D4: 00200000
  v_mac_f32     v192, s0, v0                                // 000000D8: 3F800000
  v_mac_f32     v192, s0, v0                                // 000000DC: 3F800000
  v_mac_f32     v192, s0, v0                                // 000000E0: 3F800000
  v_mac_f32     v192, s0, v0                                // 000000E4: 3F800000
  v_cndmask_b32  v0, s3, v0, vcc                            // 000000E8: 00000003
  v_cndmask_b32  v2, s0, v8, vcc                            // 000000EC: 00041000
  v_cndmask_b32  v34, s0, v0, vcc                           // 000000F0: 00440000
  v_cndmask_b32  v0, s0, v0, vcc                            // 000000F4: 00000000
  v_cndmask_b32  v16, s0, v0, vcc                           // 000000F8: 00200000
  v_cndmask_b32  v0, s71, v0, vcc                           // 000000FC: 00000047
  v_cndmask_b32  v49, s0, v0, vcc                           // 00000100: 00620000
  v_cndmask_b32  v0, s85, v0, vcc                           // 00000104: 00000055
  v_cndmask_b32  v34, s0, v8, vcc                           // 00000108: 00441000
  v_cndmask_b32  v0, s16, v25, vcc                          // 0000010C: 00003210
  v_cndmask_b32  v0, ttmp3, v0, vcc                         // 00000110: 00000073
  v_cndmask_b32  v0, s40, v0, vcc                           // 00000114: 00000028
  v_cndmask_b32  v0, s0, v0, vcc                            // 00000118: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 0000011C: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 00000120: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 00000124: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 00000128: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 0000012C: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 00000130: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 00000134: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 00000138: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 0000013C: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 00000140: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 00000144: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 00000148: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 0000014C: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 00000150: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 00000154: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 00000158: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 0000015C: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 00000160: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 00000164: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 00000168: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 0000016C: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 00000170: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 00000174: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 00000178: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 0000017C: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 00000180: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 00000184: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 00000188: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 0000018C: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 00000190: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 00000194: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 00000198: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 0000019C: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 000001A0: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 000001A4: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 000001A8: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 000001AC: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 000001B0: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 000001B4: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 000001B8: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 000001BC: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 000001C0: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 000001C4: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 000001C8: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 000001CC: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 000001D0: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 000001D4: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 000001D8: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 000001DC: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 000001E0: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 000001E4: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 000001E8: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 000001EC: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 000001F0: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 000001F4: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 000001F8: 00000000
  v_cndmask_b32  v0, s0, v0, vcc                            // 000001FC: 00000000
end

I am not sure how this is handled on the device: whether it runs the empty instructions, or whether that is just what is loaded into memory and the program actually drops out early when it executes. Either way it does look a little strange!
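
The padding behaviour described above amounts to rounding the instruction count up to the next multiple of 64. Here is a small sketch of that arithmetic, which also counts the entries in the two listings from the offset of their last line. The function names are made up for illustration, and the 64 entry chunk size is only what I observed in these listings, not something taken from AMD documentation.

```python
CHUNK = 64  # chunk size observed in the listings above


def padded_length(instruction_count, chunk=CHUNK):
    """Round instruction_count up to the next multiple of chunk."""
    return ((instruction_count + chunk - 1) // chunk) * chunk


def listing_length(last_offset, bytes_per_entry=4):
    """Number of entries in a listing whose last entry starts at last_offset."""
    return last_offset // bytes_per_entry + 1


for count in (63, 64, 65):
    print(count, padded_length(count))

print(listing_length(0xFC))   # first listing: 64
print(listing_length(0x1FC))  # second listing: 128
```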

I am going to continue my research into how a shader actually runs on the device and will update this if I find out more!

 

Creating A Render Target In DirectX11

This is the code to create a render target which can be bound both as a render target for writing and as a texture for reading in a shader.

//D3D Objects To Create Into
ID3D11Texture2D* _Texture2D = NULL;
ID3D11RenderTargetView*	_RenderTargetView = NULL;
ID3D11ShaderResourceView* _ShaderResourceView = NULL;

//D3D Device
ID3D11Device* Device = _directX->GetDevice();

D3D11_TEXTURE2D_DESC bufferDesc;
bufferDesc.ArraySize = 1;
bufferDesc.BindFlags = D3D11_BIND_RENDER_TARGET | D3D11_BIND_SHADER_RESOURCE;
bufferDesc.CPUAccessFlags = 0;
bufferDesc.Format = format;
bufferDesc.Height = height;
bufferDesc.MipLevels = 1;
bufferDesc.MiscFlags = 0;
bufferDesc.SampleDesc = sampleDesc;
bufferDesc.Usage = D3D11_USAGE_DEFAULT;
bufferDesc.Width = width;
HRESULT hr = Device->CreateTexture2D(&bufferDesc, 0, &_Texture2D);
if (hr != S_OK)
{
	return false;
}

//Creating a view of the texture to be used when binding it as a render target
D3D11_RENDER_TARGET_VIEW_DESC renderTargetViewDesc;
renderTargetViewDesc.Format = format;
renderTargetViewDesc.ViewDimension = D3D11_RTV_DIMENSION_TEXTURE2D;
renderTargetViewDesc.Texture2D.MipSlice = 0;
//Pass the description we just filled in rather than a null pointer
hr = Device->CreateRenderTargetView(_Texture2D, &renderTargetViewDesc, &_RenderTargetView);
if (hr != S_OK)
{
	return false;
}

//Creating a view of the texture to be used when binding it on a shader to sample
D3D11_SHADER_RESOURCE_VIEW_DESC shaderResourceViewDesc;
shaderResourceViewDesc.Format = format;
shaderResourceViewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE2D;
shaderResourceViewDesc.Texture2D.MostDetailedMip = 0;
shaderResourceViewDesc.Texture2D.MipLevels = 1;
hr = Device->CreateShaderResourceView(_Texture2D, &shaderResourceViewDesc, &_ShaderResourceView);
if (hr != S_OK)
{
	return false;
}

Creating a Cubemap in DX11

I have noticed there isn't a lot of reference for this on the internet, so here is how to create a cubemap in DirectX11.

 // Load all individual images into CPU memory.
 LoadedImageData images[6];
 bool loadedImages = LoadCubeFaces(&images[0]);

 //Using DXGI_FORMAT_R8G8B8A8_UNORM (28) as an example.
 DXGI_FORMAT format = DXGI_FORMAT_R8G8B8A8_UNORM;

 //Ensure we have the images we want to put in the cube map.
 if (loadedImages)
 {
	 //D3DObjects to create
	 ID3D11Texture2D* cubeTexture = NULL;
	 ID3D11ShaderResourceView* shaderResourceView = NULL;

	 //Description of each face
	 D3D11_TEXTURE2D_DESC texDesc;
	 texDesc.Width = width;
	 texDesc.Height = height;
	 texDesc.MipLevels = 1;
	 texDesc.ArraySize = 6;
	 texDesc.Format = format;
	 texDesc.SampleDesc.Count = 1;
	 texDesc.SampleDesc.Quality = 0;
	 texDesc.Usage = D3D11_USAGE_DEFAULT;
	 texDesc.BindFlags = D3D11_BIND_SHADER_RESOURCE;
	 texDesc.CPUAccessFlags = 0;
	 texDesc.MiscFlags = D3D11_RESOURCE_MISC_TEXTURECUBE;

	 //The Shader Resource view description
	 D3D11_SHADER_RESOURCE_VIEW_DESC SMViewDesc;
	 SMViewDesc.Format = texDesc.Format;
	 SMViewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURECUBE;
	 SMViewDesc.TextureCube.MipLevels = texDesc.MipLevels;
	 SMViewDesc.TextureCube.MostDetailedMip = 0;

	 //Array to fill which we will use to point D3D at our loaded CPU images.
	 D3D11_SUBRESOURCE_DATA pData[6];
	 for (int cubeMapFaceIndex = 0; cubeMapFaceIndex < 6; cubeMapFaceIndex++)
	 {
		 //Pointer to the pixel data
		 pData[cubeMapFaceIndex].pSysMem = images[cubeMapFaceIndex].pixels;
		 //Line width in bytes
		 pData[cubeMapFaceIndex].SysMemPitch = images[cubeMapFaceIndex].rowPitch;
		 // This is only used for 3d textures.
		 pData[cubeMapFaceIndex].SysMemSlicePitch = 0;
	 }

	 //Create the Texture Resource
	 HRESULT hr = _directx->GetDevice()->CreateTexture2D(&texDesc, &pData[0], &cubeTexture);
	 if (hr != S_OK)
	 {
		 return false;
	 }

	 //If we have created the texture resource for the six faces 
	 //we create the Shader Resource View to use in our shaders.
	 hr = _directx->GetDevice()->CreateShaderResourceView(cubeTexture, &SMViewDesc, &shaderResourceView);
	 if (hr != S_OK)
	 {
		 return false;
	 }

 }
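
The "line width in bytes" used for SysMemPitch above is easy to get wrong, so here is the arithmetic as a quick sketch. It assumes the loaded faces are tightly packed with no padding at the end of each row, and that the format is the 4 bytes per pixel DXGI_FORMAT_R8G8B8A8_UNORM used in the example. The function names and the 512 pixel face size are made up for illustration.

```python
def row_pitch(width, bytes_per_pixel=4):
    """Bytes in one row of a tightly packed image."""
    return width * bytes_per_pixel


def face_size_bytes(width, height, bytes_per_pixel=4):
    """Bytes in one tightly packed cube face."""
    return row_pitch(width, bytes_per_pixel) * height


print(row_pitch(512))                 # 2048
print(6 * face_size_bytes(512, 512))  # 6291456
```

If your image loader pads rows to an alignment, use the pitch it reports instead of computing it like this.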

A look into: AMD ISA from HLSL

I thought after my last post about the Shader Analyser, which outputs AMD ISA (Instruction Set Architecture), it was worth doing a little write-up on what exactly that is and why it may be worthwhile to take the generated code into consideration when doing low level optimisation.

So to get started we will look at a simple example pixel shader. In all of these examples we will be using shader model 5.0 and will be building for the Hawaii architecture.

struct PS_INPUT
{
    float4 pos : SV_POSITION;
    float4 tex : TEXCOORD0;
};


float4 psMain(PS_INPUT input) : SV_TARGET
{
	return float4(1,1,1,1);
}

OK, so here we have our very basic pixel shader. It takes an input structure from the vertex shader, which passes in a position and a texture coordinate, but it uses neither and just writes out 1.0 into each channel.

So, we can look at this at three levels: the ASM that is generated from DirectX, the AMD IL (Intermediate Language) used by AMD's shader compiler, and the AMD ISA, which is the set of instructions the GPU actually executes. So let's take a look at these:

DirectX ASM

ps_5_0
dcl_globalFlags refactoringAllowed
dcl_output o0.xyzw
mov o0.xyzw, l(1.000000,1.000000,1.000000,1.000000)
ret 

Here we can see that it is declaring an output register (o0.xyzw) and then copying 1.0 into each channel at that address and returning. This is as straightforward as you can get. It makes no use of the position or texcoord data passed through to the shader as we haven't accessed them at all in the HLSL shader.

AMD ISA

shader psMain

  v_mov_b32     v0, 1.0                                     // 00000000: 7E0002F2
  v_cvt_pkrtz_f16_f32  v0, v0, v0                           // 00000004: 5E000100
  s_nop         0x0000                                      // 00000008: BF800000
  exp           mrt0, v0, v0, v0, v0 done compr vm          // 0000000C: F8001C0F 00000000
  s_endpgm                                                  // 00000014: BF810000
end

So now things are getting more complex looking, but don't worry, it is still very simple when you break it down! Here is a link to all the instructions. So let's take a look at this line by line.

Our first instruction is v_mov_b32, which the documentation describes as:

V_MOV_B32
Single operand move instruction. Allows denorms in and out, regardless of denorm mode, in
both single and double precision designs.

This means that the instruction is just a move, pretty much like the "mov" in the DirectX ASM. Here it is moving the value of 1.0 into the register v0.

Next we have the more intimidatingly named v_cvt_pkrtz_f16_f32. Not to worry, again we will go to the documentation to see what this is:

v_cvt_pkrtz_f16_f32
Convert two float 32 numbers into a single register holding two packed 16-bit floats.

So in the first instruction we stored a 32-bit value of 1.0 into the register v0. Now we are going to store two 16-bit values in that same register, and both of them are going to be the value we stored in v0 initially, converted to 16 bits. So this gives us a register which is storing two 16-bit values of 1.0. This is a little strange, but things tend to get a little bit strange the further down you go as you start seeing things the compiler has done to make the code run more optimally for its hardware.
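
To make the packing a bit more concrete, here is a rough sketch of what the register would hold afterwards, using Python's struct module to do the float to half conversion. The function names are made up for illustration, and which input lands in the low half is an assumption on my part. Python rounds to nearest here, whereas the "rtz" in the instruction name suggests rounding towards zero, but for a value like 1.0 that converts exactly the result is the same.

```python
import struct


def float_to_half_bits(value):
    """Bit pattern of value as a 16-bit float."""
    return struct.unpack("<H", struct.pack("<e", value))[0]


def pack_two_halves(low, high):
    """Pack two 16-bit floats into one 32-bit register value."""
    return float_to_half_bits(low) | (float_to_half_bits(high) << 16)


print(hex(float_to_half_bits(1.0)))     # 0x3c00
print(hex(pack_two_halves(1.0, 1.0)))   # 0x3c003c00
```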

Our next instruction is "s_nop". If you have worked with assembly before this may be familiar: it means no operation. The description from the documentation:

s_nop
Do nothing. Repeat NOP 1..8 times based on SIMM16[2:0]. 0 = 1 time, 7 = 8 times.

Now this is even more odd than before you must be thinking. Why on earth would a shader want to waste an instruction doing nothing? Well, this calls for us to dig further into the documentation where we will find this little bit of information:

Must add an S_NOP between two consecutive S_SETREG to the
same register.

S_SETREG is an instruction to write data to an internal hardware register, so this could be telling us that the reason for this s_nop may be that the compiler is adding the required s_nop as the next instruction is going to write to the same register as the instruction above. However, in this case I believe the s_nop is there to pad this shader to be 4 instructions and has been placed before the export instead of after it as an optimisation.
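
The repeat count in the s_nop description quoted earlier can be worked through in a couple of lines. This is only a sketch of the SIMM16[2:0] rule as the documentation states it; the function name is made up for illustration.

```python
def s_nop_repeat_count(simm16):
    """Repeat count encoded in SIMM16[2:0]: 0 means once, 7 means eight times."""
    return (simm16 & 0x7) + 1


print(s_nop_repeat_count(0x0000))  # 1
print(s_nop_repeat_count(0x0007))  # 8
```

So the "s_nop 0x0000" in our shader is a single no-op.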

The next instruction is "exp" which our documentation tells us is the export function for this shader program. This is where the shader writes to the render targets.  This line is a little more complex than the others so we can look at it one bit at a time.

The first bit to make sense of would be the words at the end: "done compr vm". These are each individual flags. The flag "done" is used to indicate that this is the last output to a render target from this program, "compr" is telling the GPU that this is 16 bits per component rather than 32 bits, and "vm" is saying that this is a valid mask for the wavefront and must be set at least once per pixel shader. I will go into wavefronts and what this means in more detail in a later post.

The next part of this line to take a look at is the "mrt0" this is telling the program to write into the first render target. This is specified in our HLSL shader where we set the output of psMain to write to SV_TARGET. 

The last part of the line is the repeat of the register "v0". This is telling the program which value to write into each channel. "v0" currently contains two channels, each with a 16-bit value of 1.0 in it. And due to the "compr" flag only the first component is read. 

Finally, the last line of the AMD ISA is "s_endpgm", which is obviously the instruction to end the program, but to be consistent here is the description from the documentation:

S_ENDPGM
End of program; terminate wavefront.

This is telling the GPU to end this program and the wavefront. Pretty straightforward.

So you can see that the ISA is just the lower level version of the DirectX ASM. It is doing the same things but is a little more explicit about how, and we are beginning to see the quirks of the GPU come through.

AMD IL

This will be covered in the next post!

MFC Quirks: Part 2

SCROLLING A DIALOG

Sometimes you will want a text Edit Control to automatically scroll when you add a string to it. Unfortunately there isn't any support in the default class for this. I have found the easiest way to do this is to use CEdit::SetSel to fake a user selection of the last character in the control. Here is an example of that being done using a CEdit control and a std::string:

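// Replace the control's text with the updated string.
SetDlgItemText(IDC_ERRORWINDOW, ErrorWindowText.c_str());

// Select the last character; the final argument (bNoScroll) is false so the caret is scrolled into view.
CEdit* errorWindow = static_cast<CEdit*>(GetDlgItem(IDC_ERRORWINDOW));
errorWindow->SetSel(ErrorWindowText.length() - 1, ErrorWindowText.length(), false);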

CHANGING WORKING DIRECTORY WITH OPEN/SAVE DIALOG

Calls to either GetSaveFileName or GetOpenFileName will change the working directory of your running program in the background, which can be a problem if you use any relative paths later. To get around this you can use GetCurrentDirectory to store the current working path before the call, and then call SetCurrentDirectory with the stored path afterwards to restore the original working directory.
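The save-and-restore pattern can be wrapped in a small guard object so the restore cannot be forgotten. This is only a sketch: the class name is hypothetical, and it uses the portable C++17 std::filesystem::current_path as a stand-in for the GetCurrentDirectory/SetCurrentDirectory pair mentioned above.

```cpp
#include <filesystem>
#include <system_error>

// Remembers the working directory when created and restores it when destroyed.
class WorkingDirectoryGuard {
public:
    WorkingDirectoryGuard() : saved_(std::filesystem::current_path()) {}

    ~WorkingDirectoryGuard() {
        // Use the error_code overload so the destructor never throws.
        std::error_code ignored;
        std::filesystem::current_path(saved_, ignored);
    }

    WorkingDirectoryGuard(const WorkingDirectoryGuard&) = delete;
    WorkingDirectoryGuard& operator=(const WorkingDirectoryGuard&) = delete;

private:
    std::filesystem::path saved_;
};
```

To use it, declare a WorkingDirectoryGuard in the same scope as the call that opens the dialog. When that scope ends the stored path is put back, even if the function returns early.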

CCOMBOBOX CLEAR DOESN'T

CComboBox has a clear function but this only clears the textbox part of the combo box object. To clear the list of entries as well you will need to also call ResetContent.

MFC Quirks: Part 1

I have been working with MFC to move the shader analysis code so it can be run with DirectX running in C++. To do this I reimplemented most of what I did in C# in MFC (which I have worked with in the past).

Since it has been a while since I last worked with MFC, I had forgotten a lot of the little quirks. It also seems to be used much less these days than it was 5 years ago, so scouring the internet for the answer to some of the more niggling problems has become more difficult, and some of the sites are simply out of date. Not to mention a few new problems Microsoft has introduced in Visual Studio 2013.

So here is a small overview of little problems and how to avoid them.

DRAW ORDER

The draw order in MFC is by default the order of the creation of the items of the dialog. This can become a pain once you have added many items to the dialog and want to have an older item drawn last.

Unfortunately there isn't an easy solution to this, so here is the awkward way.

Right-click on the dialog's resource (.rc) file and select "View Code". This will show you the code for the creation of the dialog, which will look something like this:

IDD_DXMFC_DIALOG DIALOGEX 0, 0, 1031, 475
STYLE DS_SETFONT | DS_MODALFRAME | DS_FIXEDSYS | WS_POPUP | WS_VISIBLE | WS_CAPTION | WS_SYSMENU
EXSTYLE WS_EX_APPWINDOW
CAPTION "DXMFC"
FONT 8, "MS Shell Dlg", 0, 0, 0x1
BEGIN
SCROLLBAR IDC_SCROLLBAR1,342,0,8,318,SBS_VERT
GROUPBOX "",IDC_STATIC,350,339,108,120
CONTROL "Error Level 0",IDC_ERRORLEVEL0,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,9,458,57,10
CONTROL "Error Level 2",IDC_ERRORLEVEL2,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,143,458,57,10
CONTROL "Error Level 1",IDC_ERRORLEVEL1,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,77,458,57,10
CONTROL "Error Level 3",IDC_ERRORLEVEL3,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,211,458,57,10
END

So if we want this to draw the "Error Level 3" CONTROL before the GROUPBOX we can simply place the CONTROL before it in the list like this:

IDD_DXMFC_DIALOG DIALOGEX 0, 0, 1031, 475
STYLE DS_SETFONT | DS_MODALFRAME | DS_FIXEDSYS | WS_POPUP | WS_VISIBLE | WS_CAPTION | WS_SYSMENU
EXSTYLE WS_EX_APPWINDOW
CAPTION "DXMFC"
FONT 8, "MS Shell Dlg", 0, 0, 0x1
BEGIN
SCROLLBAR IDC_SCROLLBAR1,342,0,8,318,SBS_VERT
CONTROL "Error Level 3",IDC_ERRORLEVEL3,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,211,458,57,10
GROUPBOX "",IDC_STATIC,350,339,108,120
CONTROL "Error Level 0",IDC_ERRORLEVEL0,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,9,458,57,10
CONTROL "Error Level 2",IDC_ERRORLEVEL2,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,143,458,57,10
CONTROL "Error Level 1",IDC_ERRORLEVEL1,"Button",BS_AUTOCHECKBOX | WS_TABSTOP,77,458,57,10
END

So now if you save and close that file then run a build you will see that they will now be drawing in the new order.

DIRECTX DIALOG ITEM DRAW ORDER

Now if you are creating a DirectX view in your code you may want it to draw at a different time than when you are creating the CView-derived object. This trick is very similar to the one above. First, create a custom draw MFC object, place it in the dialog where you want your DirectX view to be, and set its place in the resource file so that it will be called on to draw at the correct time. Then, when you come to create your DirectX view, give it the HWND of the new custom draw MFC object. All of your drawing will now be done within that box. I find this a lot more convenient than creating the object in code, as it lets you scale and position your DirectX view in the designer rather than counting pixels.

NON-UNICODE MFC SUPPORT IN VISUAL STUDIO 2013

When developing any project I like to keep everything in either unicode or multibyte, because I find mixing just leads to excess code that is prone to mistakes. This is especially true when working with Direct3D's D3DCompile function, which takes a void* as input for the shader text but will fail without a proper error message if you pass in a unicode string. To avoid this I switched the project into multibyte everywhere. The only problem is that when I transferred over to Visual Studio 2013 this was no longer supported in the MFC libraries. To fix this you have to go to this MSDN page to get an update to enable the standard multibyte functions of MFC.

NEW LINES IN MFC DIALOGS

A lot of MFC dialogs use the Windows new line format of "\r\n" rather than only "\n", so a string that only contains "\n" will not break onto a new line. You will also have to enable the 'Multiline' property on the text box you are using.
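If your strings are built with plain "\n", one option is to convert them just before handing them to the control. Here is a minimal sketch of that conversion; the function name is hypothetical and it assumes a narrow std::string, as used elsewhere in this post.

```cpp
#include <cstddef>
#include <string>

// Return a copy of text where every lone "\n" becomes "\r\n".
// Existing "\r\n" pairs are left alone so they are not doubled up.
std::string toWindowsNewlines(const std::string& text) {
    std::string result;
    result.reserve(text.size());
    for (std::size_t i = 0; i < text.size(); ++i) {
        const bool loneNewline =
            text[i] == '\n' && (i == 0 || text[i - 1] != '\r');
        if (loneNewline) {
            result += '\r';
        }
        result += text[i];
    }
    return result;
}
```

The result can then be passed to SetDlgItemText in the same way as the scrolling example in Part 2.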

Updated Shader Analyser With AMD IL support

After receiving some good feedback from reddit I have added the ability to compare the AMD IL that is generated by the driver. Here is the updated download.

Showing the comparison between the generated DirectX ASM and AMD IL for PS_5_0

This is done by calling the AMD program CodeXLAnalyser.exe to generate it. This .exe is in the same directory as "Shader Analyzer.exe", so if AMD updates it in the future you can just swap out the .exe. I have also added a control on the dialog to specify any AMD IL command line arguments, which can be found at the AMD developer site here. However, the information there is either incomplete or out of date, so I have included a screenshot of the full output of the help command below.


Worf, Joey And Furniture From 1985

If you have watched a lot of Star Trek: The Next Generation in your life you may have noticed that Worf has a really bizarre taste in furniture. One of the stranger props is a chair that is just a pad surrounded by a lot of odd balls.

Now what came as a surprise was seeing this chair again in an episode of Friends ("The One That Could Have Been: Part 2"). The scene opens with Joey looking smug and sitting on it while Rachel looks at all the bizarre things he has bought with his acting money.

I would be that smug in that chair too.

My first thought when I saw it was that this must just be a prop sitting around the studio that they grabbed as another oddity to fill Joey's flat with, especially since it was used in the first season ST:TNG episode "Haven" as alien spaceship filler. Now, here comes the strange part: "Haven" first aired in November of 1987 and "The One That Could Have Been: Part 2" aired in February of 2000. That's a gap of more than twelve years in which this odd piece of furniture hasn't shown its face. I suppose that isn't the strangest thing that could happen; you see old sci-fi props shared between films all the time over longer time spans (although usually with a little more frequency). So I had a little further dig and found out that ST:TNG filmed on the Paramount Studios stages 8 and 9, whereas Friends was filmed at the Warner Bros. Sound Stage 24. I am not 100% sure how props are shared between studios, but I imagine that unless they share a single prop supplier these might be two separate chairs!

So, a little more googling and I have learned that this type of chair was actually designed in the 1980s by a Norwegian furniture designer named Peter Opsvik; it is the "Garden" chair. Here is his page describing it from his site. The version on his site here looks like the original design and is what we have seen in ST:TNG and Friends. Oddly, it seems the original use of this furniture was as flexible outdoor seating. The newer ones you can find on his site look like they are from a 2011 relaunch of the design, which looks less sci-fi but at least a little more comfortable.

I was hoping to end this article with a link to where to find the chair if you for some reason wanted one, but it looks like they were only being sold by a Norwegian company called Rybo and their website looks to be broken at the moment. So instead I am going to leave you with a collection of these interesting looking chairs.

EDIT: I found that you can get simpler versions of this chair over at http://www.globeconcept.org/ not quite the same, but it is a start!

The 1985 design.

More colourful here than the black one on TV

Current relaunch design.

And last of all its original Star Trek: The Next Generation season one debut!

Thanks for reading!

Programming Keyboard Shortcuts!

There are a handful of super useful keyboard shortcuts that I use all the time. Some of them are less known than others so I thought I would list them by software.

Visual Studio 2010/2012

Copy/Paste Stack: Ctrl + Shift + V

This brings up a context menu of the last 10 things you have copied. This is most helpful when refactoring code. You can combine this with '0-9' to select from the list or, oddly, 'a' to select the last entry in the list.

Extended Copy: (Ctrl + C) when nothing is highlighted

This will copy the entire line including line breaks. This is helpful for quickly duplicating a line.

Build And Run Controls: F5, F7, F10

These ones are pretty obvious and you may already know them. F5 is build and run, F7 is build and F10 is run but break on the first line of execution (always handy on new projects to quickly find the entry point).

Box Select: Alt + Click and Drag

This allows you to select a box of text on screen instead of line by line. Once selected you can copy that box or paste it while keeping its current formatting. This kind of selection also allows for writing to multiple lines at a time. It is the biggest time saver when renaming or changing lots of values in a list. If you haven't tried it before give it a go; it will change how you work for the better.

Go To Definition: F12

This will take you to the definition for a function or variable. Keep in mind that in larger projects this can be very, very slow.

Add/Remove Breakpoint: F9

This is just to speed things up when writing code to quickly add the breakpoints you are going to need.

Step Over/ Step In/ Step Out: F10/F11/Shift+F11

Faster for debugging than using the buttons.

Visual Assist X

Switch Between .cpp and .h: Alt + O

Much easier than switching through the solution window. When I code on a computer without Visual Assist X this is the feature I usually miss the most.

A Better Go To Definition: Alt-G

This is the same as the Visual Studio shortcut but much faster.

Find All References: Alt-Shift-F

This is the visual assist version of find all references so it is basically identical to the standard one in Visual Studio but again, much faster.

Notepad++

Duplicate Line: Ctrl+D

Notepad++ has most of the standard hot keys of a normal text editor with the bonus of this magnificent hotkey. It duplicates the line you are currently on or whatever you have selected at that time. Nice.

Create an Incremental List: Alt+Mouse to select a column. Then Alt+C

This brings up a prompt allowing you to fill that column with incremental numbers. Super handy for writing simple array values quickly.

Tools I Use For Render Programming

I thought it would be useful to list the tools that I use when working on rendering in games to help me be more productive. Most are obvious, but here we go:

  • Visual Studio 2012
    • My primary language when coding is C++ and I program mostly on Windows, so Visual Studio is by a large margin the best tool for it. The added bonus is that with Pix losing support in later versions of Windows there are now built in graphics debugging tools in Visual Studio, which aren't perfect yet but are definitely useful if you have nothing else available and are handy next to the built in CPU profiling tools.
  • Intel VTune
    • Very powerful CPU profiling tool. It takes a while to get used to how to set it up to get the kind of information you want accurately but once you do it is incredible. Though not useful it does have the interesting ability to let you sort the lines of your source code alphabetically as well as by performance...
  • Intel GPA
    • I learned about this one only recently but have found it to be more useful and stable than nSight in a lot of situations though still not as robust. It is particularly good when you want to capture multiple frames and then analyse them separately which is slightly more of a pain to do in other tools.
  • nVidia nSight
    • This is a big one. It has everything you need: good overlays and good integration into Visual Studio. The upside of this one is that it has been around a while, has good support and is relatively easy to find help with if you encounter any troubles. The downside is that it's only for nVidia cards.
  • Windows Pix
    • This is the standard graphics debugger if you are using Windows, and it comes with the DirectX SDK. It is good and helpful but lacks some of the fancier features of newer software. It appears that this is beginning to be discontinued in favor of the new built in GPU debugging tools which started in VS2012.
  • DirectX 11
    • This is a personal preference over OpenGL. I find that DirectX is a lot more standardised and has better support. Although I am looking forward to Mantle.
  • Visual Assist X
    • As far as tools to help you get work done go, this is the number one. It has a whole collection of tools to get you working faster, from quick switching between header and cpp to a much better auto-complete than intellisense. When it comes to working in hlsl shaders or cuda kernels there are small registry changes you can make to enable C style highlighting and auto-completion. This isn't needed for some standard hlsl extensions in VS2012 though, which is nice.
  • Notepad++
    • This is such a useful editor to have just sitting open for when you need to open a random data file. It has a lot of options for highlighting and a few nice hotkeys I wish they would copy over into Visual Studio. I am especially taken with Ctrl+D to duplicate lines without overwriting what is in the clipboard.

  • HONOURABLE MENTION: Google Music & InoReader
    • Google Music is relatively cheap (cheaper than buying music outright anyway) and since Google seems to know everything about everyone it is pretty good at recommending new music. This lets me build up some decent playlists for the different paces of the day, which is a godsend in an open plan office where you sometimes need headphones to mask out noise. Syncing with my phone is just an added plus.
    • InoReader is just a Google Reader replacement which I have taken to using during my longer builds or captures. It is handy for keeping up with the latest graphics blogs and news. Especially during the big conferences when all the early releases of the presentations and papers are being put out.

Shadertoy Embedded

I am doing a quick test to make sure that it is possible to embed Shadertoy GLSL examples into this page.

This will greatly help when exploring ideas and writing about them. It will also be nice to not have to use WebGL more than I have to.


If you can see the audio test above then the rest of the GLSL examples on this site should work for you.