Monday, December 01, 2008

Questions to ask during an SA interview

General Unix questions

Here are some quick fire questions that I’ve made use of during the interview process. These are intended to get an idea of their basic knowledge. It can be a bit puzzling that whilst a candidate may know all about the internals of a storage enclosure, they don’t know how to locate a file with a particular name without using grep.

Answers aren’t always given because, if you’re conducting an interview on this subject, you should probably know most of the answers.

  • What scripting languages do you know? I’d hope for at least one of ksh, bash or csh (roughly in that order: I despise csh mind). Ideally perl will be known to them (see later)
  • What are setuid/setgid in relation to file permissions?
  • What are setuid/setgid in relation to directory permissions?
  • What’s a run level? How do Linux and Solaris run levels differ? Solaris: What’s /etc/path_to_inst?
  • What does mknod do? What’s a named pipe?
  • How would you shut down a Sun system very quickly? (uadmin) How does it work? (Doesn’t go through run levels)
  • What is an inode?
  • What is a directory?
  • What does init do? What does inetd do?
  • What performance monitoring tools do you use?
  • What’s PGP/GPG; Public/Private Key; Setting up trust
  • What’s ssh? Setting up trust between accounts.
  • X Windowing environment. How does it differ from others? Setting up X on Windows/Macs?
  • What’s VNC?
  • Any database experience?

General questions

Whilst I like to ask open questions that should give plenty of opportunity for the candidate to answer fully, it’s still important to give the candidate a chance to sell themselves. Try some open ended questions such as:

  • What project are you most proud of?
  • What have you achieved which demonstrated overcoming adversity in some form?
  • Where have you learnt the most that improves your skills?
  • Why are you leaving your current job?
  • Why do you want this job?
  • Why should we hire you?
  • Have you heard anything that concerns you?

Some to avoid:

  • Personally, I think there is very very little merit in asking about the minutiae of commands such as sed, grep, awk, sort and other associated Unix command utilities. This is especially so when it comes to command line switches to various commands. I take the view that, so long as somebody knows what the command does, and can give some context, there’s little to be gained by knowing if the candidate knows the “-w” flag in grep searches for whole words. The man pages are there for a purpose, and I’d expect them to use them for such insignificant parts of overall knowledge.

And of course, more personal questions:

  • What do you get up to when you’re not working?
  • I notice you’re from [place] - why did you leave? (I interviewed a lot of Antipodeans in London, and always like to hear their reasons for travelling, and so on).
  • I see you’re interested in [interest] - is it fun? (There’s almost always something of interest to talk about that’s a long way away from technology. Try to find it!)

Don’t forget some obvious ones too, because agencies are often too eager to e-mail CVs without checking them:

  • Are you able to work here for the duration of the position?
  • What visa are you on? (Some foreign nationals may have restrictive work visas)
  • Can you provide me with references?

and last, but far from least:

  • Do you have any questions for us?

… It’s a two way process, and you should always give them the opportunity to ask. Don’t assume previous or subsequent interviewers will, but if you’re one of six or seven interviews give them some slack if they’ve asked them all already ;-)

Don’t forget to thank them for coming along too.


http://www.leyton.org/2005/05/20/interviewing-for-unix-system-administrators/

http://www.rogerdarlington.co.uk/Interview.html

About the inode

An inode is a data structure in UNIX operating systems that contains important information pertaining to files within a file system. When a file system is created in UNIX, a set number of inodes is created as well. Usually, about 1 percent of the total file system disk space is allocated to the inode table.

Sometimes, people interchange the terms inode and inumber. The terms are similar and do correspond to each other, but they don't refer to the same things. Inode refers to the data structure; the inumber is actually the identification number of the inode—hence the term inode number, or inumber. The inumber is only one important item of information for a file. Some of the other attributes in an inode are discussed in the next section.

The inode table contains a listing of all inode numbers for the respective file system. When users search for or access a file, the UNIX system searches through the inode table for the correct inode number. When the inode number is found, the command in question can access the inode and make the appropriate changes if applicable.

Take, for example, editing a file with vi. When you type vi filename, the inode number is found in the inode table, allowing you to open the inode. Some attributes are changed during the edit session of vi, and when you have finished and typed :wq, the inode is closed and released. This way, if two users were to try to edit the same file, the inode would already have been assigned to another user ID (UID) in the edit session, and the second editor would have to wait for the inode to be released.

The inode structure

The inode structure is relatively straightforward for seasoned UNIX developers or administrators, but there may still be some surprising information you don't already know about the insides of the inode. The following definitions provide just some of the important information contained in the inode that UNIX users employ constantly:

* Inode number
* Mode information to discern file type and also for the stat C function
* Number of links to the file
* UID of the owner
* Group ID (GID) of the owner
* Size of the file
* Actual number of blocks that the file uses
* Time last modified
* Time last accessed
* Time last changed
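
The attributes in this list can be read from a script as well as from C. What follows is only a minimal sketch in Python, assuming a POSIX-like system; the function name describe_inode is made up for illustration, and st_blocks is read defensively because not every platform reports it.

```python
import os
import stat
import tempfile
import time


def describe_inode(path):
    """Return the inode attributes listed above for `path` as a dict."""
    st = os.lstat(path)  # lstat describes a symlink itself, not its target
    return {
        "inode": st.st_ino,
        "mode": stat.filemode(st.st_mode),          # e.g. "-rw-r--r--"
        "links": st.st_nlink,
        "uid": st.st_uid,
        "gid": st.st_gid,
        "size": st.st_size,
        "blocks": getattr(st, "st_blocks", None),   # missing on some platforms
        "mtime": time.ctime(st.st_mtime),           # time last modified
        "atime": time.ctime(st.st_atime),           # time last accessed
        "ctime": time.ctime(st.st_ctime),           # time last changed (on Unix)
    }


if __name__ == "__main__":
    # Demonstrate on a throwaway file rather than assuming any path exists.
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"hello")
        tmp.flush()
        for key, value in describe_inode(tmp.name).items():
            print(f"{key:>6}: {value}")
```

Note that the file name is not among these attributes: it lives in the directory entry that points at the inode, which is why one inode can have several names (hard links).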


http://www.ibm.com/developerworks/aix/library/au-speakingunix14/index.html?ca=drs-

Getting inode information on FreeBSD

> stat -x /bin/sh
File: "/bin/sh"
Size: 112288 FileType: Regular File
Mode: (0555/-r-xr-xr-x) Uid: ( 0/ root) Gid: ( 0/ wheel)
Device: 0,91 Inode: 49057 Links: 1
Access: Fri Nov 28 17:50:03 2008
Modify: Tue Jun 3 16:13:27 2008
Change: Tue Jun 3 16:13:27 2008


# fsdb -r /dev/ad14s1a
(...)
fsdb (inum: 2)> inode 49057
current inode: regular file
I=49057 MODE=100555 SIZE=112288
BTIME=Jun 3 16:13:27 2008 [0 nsec]
MTIME=Jun 3 16:13:27 2008 [0 nsec]
CTIME=Jun 3 16:13:27 2008 [0 nsec]
ATIME=Nov 28 17:50:03 2008 [0 nsec]
OWNER=root GRP=wheel LINKCNT=1 FLAGS=0 BLKCNT=dc GEN=2d96bb57
fsdb (inum: 49057)> blocks
Blocks for inode 49057:
Direct blocks:
194536, 194544, 194552, 194560, 194568, 194576, 194632 (7 frags)
fsdb (inum: 49057)>





> find -x / -inum 49057 -exec stat -x {} \;
File: "/bin/sh"
Size: 112288 FileType: Regular File
Mode: (0555/-r-xr-xr-x) Uid: ( 0/ root) Gid: ( 0/ wheel)
Device: 0,91 Inode: 49057 Links: 1
Access: Fri Nov 28 18:05:57 2008
Modify: Tue Jun 3 16:13:27 2008
Change: Tue Jun 3 16:13:27 2008
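
The find -x / -inum lookup above can be approximated in a script. This is a rough sketch, not a replacement for find: the names find_by_inode and _same_dev are hypothetical, it only reports entries below the starting directory, and it compares device numbers to stay on one file system the way -x does.

```python
import os


def _same_dev(path, dev):
    """True if `path` lives on device `dev`; unreadable paths count as False."""
    try:
        return os.lstat(path).st_dev == dev
    except OSError:
        return False


def find_by_inode(root, inum):
    """Yield paths under `root` whose inode number is `inum`, staying on root's file system."""
    root_dev = os.lstat(root).st_dev
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune directories that sit on another device (mount points).
        dirnames[:] = [d for d in dirnames
                       if _same_dev(os.path.join(dirpath, d), root_dev)]
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # vanished or unreadable entry; skip it
            if st.st_ino == inum and st.st_dev == root_dev:
                yield path
```

A file with more than one hard link is reported once per name, since every name refers to the same inode.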

http://forums.freebsd.org/showthread.php?t=628

Tuesday, September 16, 2008

How to create and sign Certificates with OpenSSL

http://lists.apple.com/archives/Java-dev/2001/Jul/msg00769.html

How to generate and sign Certificates with OpenSSL. With this technique, I can also create a Certificate Chain. The example steps are:
1. Create a Root Private Key:
openssl genrsa -rand randomfile -out root.key 1024
a). genrsa - generate an RSA key.
b). -rand randomfile - OS X does not have a /dev/random that will generate a pseudo-random number for it. Accordingly, we must provide a file with a seed. My first tests use an old core file. I could create a Java program to use SecureRandom to create a seed file or a set of seed files. Note that you can get entropy running under OS X as well.
c). -out root.key - The file to contain the private key.
d). 1024 - The length of the key in bits.
2. Create a Root Certificate for the Private Key and thus a Public Key as well:
openssl req -new -key root.key -x509 -out root.crt
a). req - PKCS#10 certificate request.
b). -new - a new Certificate request. The user will be prompted for relevant field values.
c). -key root.key - The private key file to be used for generating the Certificate.
d). -x509 - output a self-signed certificate instead of a Certificate request.
e). -out root.crt - The output file name.
3. To continue the process and create a Certificate chain, go through the same steps (1 and 2) to create a Private Key/Certificate pair for the to-be-signed Certificate. The output files, in this case, are:
a). one.key
b). one.crt - note that this is a self-signed Certificate. This must be the case for the next steps.
4. Convert the Certificate generated in step 3 to a Certificate request:
openssl x509 -x509toreq -in one.crt -out one.req -signkey one.key
a). x509 - Certificate signing utility.
b). -x509toreq - Convert a Certificate to a Certificate Request.
c). -in one.crt - The input Certificate (generated in step 3).
d). -out one.req - The output file which is a Certificate Request.
e). -signkey one.key - The Private Key used to sign the request. In this case, it is the Private Key generated in step 3. Supposedly, this will convert the input to a self-signed Certificate which might make part of step 3 redundant. On the other hand, the redundancy may be necessary to complete all the steps.
5. Sign the Certificate generated in step 3 with the Root key generated in step 1:
openssl x509 -req -in one.req -CA root.crt -CAkey root.key -CAcreateserial -out signed1.crt
a). x509 - Certificate signing utility.
b). -req - A Certificate Request is the expected input. Note, by default, a Certificate is the expected input; this option overrides that expectation.
c). -in one.req - The input Certificate Request.
d). -CA root.crt - Part of the Certificate Authority (CA) options. This specifies the CA Certificate to be used for the signing.
e). -CAkey root.key - Part of the CA options. This specifies the Private Key to be used in signing the Certificate Request (one.req).
f). -CAcreateserial - Part of the CA options. This specifies that a serial number file is to be created if it does not exist. This file contains the serial number "02" and the Certificate being signed will have "1" as its serial number. This file is named after the prefix of the -CA file; in this case the output file is root.srl. Note: Future signing using this Root Certificate should use
-CAserial root.srl
so that the serial number may be incremented.
g). -out signed1.crt - The name of the output Certificate that is signed by the Root Certificate.
6. Continue the process as long as desired, signing each new Certificate with the previously generated Private Key to create a longer Certificate chain.
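
Steps 1 to 5 can be strung together in a small script. The sketch below only builds the openssl argument lists and, if asked, runs them in order; it is one possible arrangement, not the original author's tooling. The function names, the -subj option (used here to avoid the interactive prompts), the 2048-bit default and the omission of -rand are all assumptions added for illustration.

```python
import shutil
import subprocess


def chain_commands(root="root", leaf="one", signed="signed1", bits=2048):
    """Build the openssl invocations for steps 1-5 as argument lists."""
    def keypair(name):
        # Steps 1 and 2: private key, then a self-signed certificate for it.
        return [
            ["openssl", "genrsa", "-out", f"{name}.key", str(bits)],
            ["openssl", "req", "-new", "-key", f"{name}.key", "-x509",
             "-subj", f"/CN={name}", "-out", f"{name}.crt"],
        ]

    return keypair(root) + keypair(leaf) + [
        # Step 4: turn the self-signed leaf certificate into a request.
        ["openssl", "x509", "-x509toreq", "-in", f"{leaf}.crt",
         "-out", f"{leaf}.req", "-signkey", f"{leaf}.key"],
        # Step 5: sign the request with the root key and certificate.
        ["openssl", "x509", "-req", "-in", f"{leaf}.req",
         "-CA", f"{root}.crt", "-CAkey", f"{root}.key", "-CAcreateserial",
         "-out", f"{signed}.crt"],
    ]


def run_chain(workdir):
    """Run the commands in `workdir`; stops at the first failure."""
    if shutil.which("openssl") is None:
        raise RuntimeError("openssl not found on PATH")
    for cmd in chain_commands():
        subprocess.run(cmd, cwd=workdir, check=True)
```

Calling run_chain on an empty scratch directory should leave root.key, root.crt, one.key, one.crt, one.req, signed1.crt and root.srl behind, assuming the installed openssl accepts these options.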

Thursday, August 21, 2008

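Buddhist humor

First joke: There was once an ascetic hidden deep in the mountains who kept his heart pure and his desires few, meditating every day in search of enlightenment.
One day a shepherd girl walked past him. From then on his worldly heart was stirred, and he fell for the girl. The Buddha then appeared and admonished him: he must concentrate on his practice, and once he attained enlightenment he would be strong through wanting nothing, and free of worries.
So the ascetic devoted himself to his practice... 500 years went by, and at last he attained enlightenment.
The Buddha appeared again and asked him with a smile: "Well? No desires, no worries, right?"
The ascetic answered: "I want to see that girl!"
Second joke: A master and his disciple went out begging for alms and came to a river. The water had risen, and a woman stood on the bank in great distress. She begged the master to carry her across because she had something extremely urgent to attend to at home, so the master carried her over the river.
After they had crossed, the disciple kept asking: "We are not supposed to go near women. How could Master carry a woman across the river?" The master did not answer, and the young monk kept pestering him with the question. At last the master turned around and replied: "I have already put her down. Why are you still carrying her?"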

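Meaning of the FTP numeric reply codes


110 Restart marker reply.
120 Service will be ready in the given number of minutes.
125 Data connection already open; transfer starting.
150 File status okay; about to open the data connection.
200 Command okay.
202 Command not implemented; superfluous at this site.
211 System status, or system help reply.
212 Directory status.
213 File status.
214 Help message.
215 NAME system type.
220 Service ready for a new user.
221 Service closing the control connection; the user can be logged out.
225 Data connection open; no transfer in progress.
226 Closing the data connection; the requested file action was successful.
227 Entering passive mode.
230 User logged in.
250 Requested file action completed.
257 Path name created or reported (for example, the current directory).
331 User name okay; password needed.
332 Account information needed for login.
350 Requested file action pending further commands.
421 Service not available; closing the control connection.
425 Cannot open the data connection.
426 Connection closed; transfer aborted.
450 Requested file action not taken.
451 Requested action aborted: local error in processing.
452 Requested action not taken: insufficient storage space.
500 Syntax error; command unrecognized.
501 Syntax error in parameters or arguments.
502 Command not implemented.
503 Bad sequence of commands.
504 Command not implemented for that parameter.
530 Not logged in.
532 Account needed for storing files.
550 Requested action not taken.
551 Requested action aborted: page type unknown.
552 Requested file action aborted: storage allocation exceeded.
553 Requested action not taken: file name not allowed.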
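
The first digit of a reply code already tells a client how to react, which is often all a script needs. Here is a small sketch of that grouping, using the class names from RFC 959; the function name classify_reply and the sample reply texts are invented for illustration.

```python
import re

# Reply classes by first digit, as named in RFC 959.
REPLY_CLASSES = {
    1: "positive preliminary",
    2: "positive completion",
    3: "positive intermediate",
    4: "transient negative completion",
    5: "permanent negative completion",
}

# Three digits, then a space (final line) or a hyphen (multi-line reply).
REPLY_LINE = re.compile(r"^(\d{3})(?:[ -](.*))?$", re.ASCII)


def classify_reply(line):
    """Split one FTP reply line into (code, reply class, text)."""
    match = REPLY_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not an FTP reply line: {line!r}")
    code = int(match.group(1))
    text = match.group(2) or ""
    return code, REPLY_CLASSES.get(code // 100, "unknown"), text


if __name__ == "__main__":
    for sample in ("150 Opening data connection",
                   "226 Transfer complete",
                   "331 Password required",
                   "550 No such file"):
        print(classify_reply(sample))
```

A 4xx reply suggests that retrying the same command later may work, while a 5xx reply means the command should not be repeated as it stands.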

Hiding the SMTP banner (Sendmail/Qmail/Postfix/Exim)

Consider using a mail server other than Sendmail: because Sendmail runs as root, it makes it comparatively easy for an attacker to escalate to root.

Sendmail

Find Sendmail's configuration file and edit it:
locate sendmail.cf
nano -w /path/to/sendmail.cf
Change the greeting in the following line:
SmtpGreetingMessage=$j Sendmail $v/$Z; $b
Restart Sendmail or reload its configuration file (killall -HUP sendmail).

Qmail

Change the smtpgreeting value used by qmail-smtpd, for example:
mail.example.com Greetings here

Postfix

Find Postfix's configuration file and edit it:
locate main.cf
nano -w /path/to/main.cf
Change the greeting in the following line:
smtpd_banner = $myhostname ESMTP $mail_name ($mail_version)
Restart Postfix.

Exim

Exim is similar to Postfix: change the smtp_banner variable in /etc/exim.conf.
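
After changing a greeting it helps to check what the banner still gives away. The following is a rough sketch that scans a 220 greeting line for product names and version-like strings; the product list, the function name banner_leaks and the sample banners are illustrative assumptions, and the version pattern is deliberately crude.

```python
import re

# Illustrative, not exhaustive: the four servers discussed in this post.
PRODUCTS = ("sendmail", "qmail", "postfix", "exim")

# Anything shaped like 2.5 or 8.14.3 is treated as a possible version number.
VERSION = re.compile(r"\b\d+\.\d+(?:\.\d+)*\b")


def banner_leaks(banner):
    """Return product names and version-like strings found in an SMTP greeting."""
    text = banner.lower()
    names = [p for p in PRODUCTS if re.search(rf"\b{p}\b", text)]
    return names + VERSION.findall(text)


if __name__ == "__main__":
    print(banner_leaks("220 mail.example.com ESMTP Postfix (2.5.1)"))
    print(banner_leaks("220 mail.example.com ESMTP"))
```

The greeting itself can be fetched with a plain TCP connection to port 25 of the server being checked; the first line the server sends is the banner.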

Raid restore

Use raidtools to restore a failed array.

Commands: raidhotremove, raidhotadd

Situation: one member of an array has failed (for example, /dev/sdb3)

#cat /proc/mdstat
md1 : active raid1 sdb2[1] sda2[0]
20482752 blocks [2/2] [UU]
md2 : active raid1 sdb3[1][F] sda3[0]
20482752 blocks [2/2] [U_]
md3 : active raid1 sdb5[1] sda5[0]
10241280 blocks [2/2] [UU]
md4 : active raid1 sdb6[1] sda6[0]
10241280 blocks [2/2] [UU]
md5 : active raid1 sdb7[1] sda7[0]
5116544 blocks [2/2] [UU]
md0 : active raid1 sdb8[1] sda8[0]
2096384 blocks [2/2] [UU]
md6 : active raid1 sdb9[1] sda9[0]
2048192 blocks [2/2] [UU]
sdb3 has failed. Before restoring it, you may want to run fsck (or fsck.ext2/fsck.ext3) to repair the partition, like this:
 
#fsck /dev/sdb3
 
Then use these commands to restore the array:

#raidhotremove /dev/md2 /dev/sdb3
#raidhotadd /dev/md2 /dev/sdb3
#watch -n1 cat /proc/mdstat (you will see something like this)
md2 : active raid1 hdb3[2] hda3[1]
119684160 blocks [2/1] [_U]
[>....................] recovery = 0.2% (250108/119684160)
finish=198.8min speed=10004K/sec
 
After a few minutes:
 
 
#cat /proc/mdstat
md1 : active raid1 sdb2[1] sda2[0]
20482752 blocks [2/2] [UU]
md2 : active raid1 sdb3[1] sda3[0]
20482752 blocks [2/2] [UU]
md3 : active raid1 sdb5[1] sda5[0]
10241280 blocks [2/2] [UU]
md4 : active raid1 sdb6[1] sda6[0]
10241280 blocks [2/2] [UU]
md5 : active raid1 sdb7[1] sda7[0]
5116544 blocks [2/2] [UU]
md0 : active raid1 sdb8[1] sda8[0]
2096384 blocks [2/2] [UU]
md6 : active raid1 sdb9[1] sda9[0]
2048192 blocks [2/2] [UU]
 
The array has been restored successfully.
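
Spotting the failed member by eye works for a handful of arrays; for monitoring, the same check can be scripted. This is a minimal sketch that assumes the two-line layout shown in the listings above (an "mdN : ..." line followed by a line ending in a [UU]-style status); the function name degraded_arrays is made up for the example.

```python
import re

ARRAY = re.compile(r"^(md\d+)\s*:")                  # e.g. "md2 : active raid1 ..."
STATUS = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")  # e.g. "[2/1] [U_]"


def degraded_arrays(mdstat_text):
    """Return {array: status} for arrays whose status string contains '_'."""
    degraded = {}
    current = None
    for raw in mdstat_text.splitlines():
        line = raw.strip()
        header = ARRAY.match(line)
        if header:
            current = header.group(1)
            continue
        status = STATUS.search(line)
        if status and current:
            if "_" in status.group(3):   # "_" marks a missing or failed member
                degraded[current] = status.group(3)
            current = None
    return degraded


if __name__ == "__main__":
    try:
        with open("/proc/mdstat") as handle:
            print(degraded_arrays(handle.read()))
    except OSError:
        print("no /proc/mdstat on this system")
```

An empty result means every array reports all members up.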

Monday, August 18, 2008

Postfix Log Entry - Enabling PIX workaround

>I have a user who sends frequently to this server. Sometime I see the
>following message and sometimes I don't. What does it mean?
>
>Mar 28 17:10:10 smtp postfix/smtp[31836]: 6DE182FC0BE: enabling PIX <CRLF>.<CRLF> workaround for ms1.nypost.com[206.15.109.53]
>
>
>


It means that ms1.nypost.com[206.15.109.53] is running a Cisco PIX firewall between you and their email server, and they have its FIXUP SMTP protocol enabled.

This means that your email server is really talking to their Cisco PIX, which is intercepting, interpreting, and relaying the SMTP protocol onward to their email relay.

The FIXUP SMTP protocol has a history of buggy code, and they probably shouldn't be enabling it.

Postfix can recognize a Cisco PIX running this protocol and activate a workaround for it. It's just an info message being logged, and doesn't indicate a problem on your end.

Thursday, May 08, 2008

FreeBSD Disklabel

disklabel on FreeBSD lets you adjust the size of partitions on the fly, much like LVM on Linux. It is a handy tool.
#disklabel -e /dev/disk

8 partitions:
# size offset fstype [fsize bsize bps/cpg]
a: 1048576 0 4.2BSD 2048 16384 8
b: 4142736 1048576 4.2BSD 2048 16384 28344
c: 312576642 0 unused 0 0 # "raw" part, don't edit
d: 4167680 5191312 4.2BSD 2048 16384 28552
e: 1048576 9358992 4.2BSD 2048 16384 8
f: 41943040 10407568 4.2BSD 2048 16384 28552
g: 41943040 52350608 4.2BSD 2048 16384 28552
h: 91943040 144293648 4.2BSD 2048 16384 28552

The values are calculated as follows:
current offset = size of the previous partition + offset of the previous partition
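
The rule is easy to check mechanically. The sketch below copies the size and offset columns from the listing above (skipping c, the raw disk) and compares each offset with the previous size plus the previous offset; the names PARTITIONS and check_offsets are made up for illustration. With the numbers as listed, a through g chain exactly, while h starts 50000000 sectors later than the rule predicts; whether that is deliberately unused space or a slip in the listing is not stated.

```python
# (letter, size, offset) in sectors, copied from the disklabel listing above.
PARTITIONS = [
    ("a", 1048576, 0),
    ("b", 4142736, 1048576),
    ("d", 4167680, 5191312),
    ("e", 1048576, 9358992),
    ("f", 41943040, 10407568),
    ("g", 41943040, 52350608),
    ("h", 91943040, 144293648),
]


def check_offsets(parts):
    """Yield (letter, expected_offset, listed_offset) for each partition after the first."""
    for (_, prev_size, prev_offset), (letter, _, offset) in zip(parts, parts[1:]):
        yield letter, prev_size + prev_offset, offset


for letter, expected, listed in check_offsets(PARTITIONS):
    note = "ok" if expected == listed else f"gap of {listed - expected} sectors"
    print(f"{letter}: expected {expected}, listed {listed} ({note})")
```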

Create the file system:

#newfs -U -O 2 /dev/disk

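Personally, I find sysinstall less convenient than working directly from the command line, and it sometimes reports an error saying that it cannot save the changes.

Reportedly, on FreeBSD and Linux you can create a file system directly on a disk without creating partitions first. Someone has tried this experiment: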
http://blog.chinaunix.net/u/12258/showart_481475.html

Saturday, May 03, 2008

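Resources for learning Oracle

I started working with Oracle in depth at the beginning of this year, and found that the complexity of its installation and administration, and the power of its features, are in a different league from MySQL and PostgreSQL. No wonder it holds 70% of the database market. I am still at the beginner stage and need to keep learning Oracle administration and tuning skills. Besides the official material, I have found some sites on the web that are quite good, such as: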
http://www.eygle.com/index-tech.htm
http://www.orafaq.com/

Oracle-ksvcreate: Process(m000) creation failed

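Over the May Day holiday the site showed error pages on two consecutive days, at fairly regular times. I checked the Tomcat logs and found database connection errors, so I called the Oracle DBA's mobile phone: switched off, ha! There was nothing for it but to deal with it myself. I looked at the alert log and found the following errors: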

Process m000 died, see its trace file

Fri May 2 10:06:10 2008

ksvcreate: Process(m000) creation failed

Fri May 2 10:07:11 2008

Process m000 died, see its trace file

Fri May 2 10:07:11 2008

ksvcreate: Process(m000) creation failed

Fri May 2 10:08:12 2008

Process m000 died, see its trace file

Fri May 2 10:08:12 2008

ksvcreate: Process(m000) creation failed

Fri May 2 10:09:15 2008

Process m000 died, see its trace file

Fri May 2 10:09:15 2008

ksvcreate: Process(m000) creation failed

..............

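It looked like a process problem. There are four Tomcat nodes on the back end, so I stopped one of them; the site returned to normal and the fault did not recur. Could a process problem really have made it impossible to create more processes? That seems unlikely, because everything had been running normally until then.

So I Googled it. Some say it is a matter of the size of the processes setting, others that it is a bug in 10g.

The processes parameter on this system has always been 150 and has never changed. It worked fine before and nobody has modified it, at least as far as I know. (I am not sure whether the DBA changed any parameters; I still need to find that out.)

As for the 10g bug, many people seem to agree with that, and Metalink appears to explain it the same way: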

Applies to:
Oracle Server - Enterprise Edition - Version: 10.2 to 10.2
This problem can occur on any platform.

Symptoms
Switching a Physical Standby Database multiple times to READ ONLY Mode will report the following Errors in the ALERT.LOG:

ksvcreate: Process(m000) creation failed

Changes
Switch Physical Standby from READ ONLY to apply and back to READ ONLY.

Cause
The Cause of this Problem has been identified in Bug 5583049.

Solution
There are two Workarounds available:

Restart the Instance.
or
Disable ADDM - Should be re-enabled if Standby takes up the Primary Role
* Set SGA_TARGET=0 and set shared_pool_size, db_cache_size, etc if using Automatic SGA Memory Management (ASMM)
* Set STATISTICS_LEVEL=BASIC to disable statistics gathering


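The author of http://zhang41082.itpub.net/post/7167/456170 says that restarting the instance solved the problem!
So I tried it, and found some differences before and after the restart:

Before the restart: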
SQL> show parameter processes;

NAME TYPE VALUE
------------------------------------ ----------- ------------
aq_tm_processes integer 0
db_writer_processes integer 1
gcs_server_processes integer 2
job_queue_processes integer 10
log_archive_max_processes integer 2
processes integer 150

After the restart:
SQL> show parameter processes;

NAME TYPE VALUE
------------------------------------ ----------- ------------
_trace_flush_processes string ALL
_trace_processes string ALL
aq_tm_processes integer 0
db_writer_processes integer 1
gcs_server_processes integer 2
job_queue_processes integer 10
log_archive_max_processes integer 2
processes integer 150

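I don't know whether the _trace_flush_processes and _trace_processes string parameters have anything to do with the problem (Process m000 died, see its trace file).
I can't yet tell how effective the restart was; all I can do is watch for a while!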